Chapter 1

Introduction 


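Physical theories are usually formulated as differential equations. Examples include
Hamilton's equations of motion (describing the classical motion of a point particle),

$$\frac{d}{dt} \begin{pmatrix} q \\ p \end{pmatrix} = \begin{pmatrix} +\nabla_p H \\ -\nabla_q H \end{pmatrix}, \tag{1.0.1}$$

where $\nabla_q V = (\partial_{q_1} V, \ldots, \partial_{q_n} V)$ is the gradient of the potential $V$ entering the
Hamilton function $H$, the Schrödinger equation

$$i \partial_t \psi = \left( -\tfrac{1}{2m} \Delta_x + V \right) \psi, \tag{1.0.2}$$

where $\Delta_x = \sum_{j=1}^n \partial_{x_j}^2$ is the Laplace operator, the wave equation

$$\partial_t^2 u - \Delta_x u = 0, \tag{1.0.3}$$

the heat equation (with source)

$$\partial_t u = +\Delta_x u + f(t) \tag{1.0.4}$$

and the Maxwell equations ($\nabla_x \times E$ is the rotation of $E$)

$$\frac{d}{dt} \begin{pmatrix} E \\ H \end{pmatrix} = \begin{pmatrix} +\varepsilon^{-1} \nabla_x \times H \\ -\mu^{-1} \nabla_x \times E \end{pmatrix} - \begin{pmatrix} \varepsilon^{-1} j \\ 0 \end{pmatrix}, \tag{1.0.5}$$

$$\begin{pmatrix} \nabla_x \cdot \varepsilon E \\ \nabla_x \cdot \mu H \end{pmatrix} = \begin{pmatrix} \rho \\ 0 \end{pmatrix}. \tag{1.0.6}$$
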
Of course, there are many, many more examples, so it helps to classify them systematically.
In the zoology of differential equations, the first distinction we need to make is
between ordinary differential equations (ODEs) and partial differential equations (PDEs),
i.e. differential equations which involve derivatives with respect to only one variable
(e.g. (1.0.1)) or several variables (all other examples). The order of a differential
equation is determined by the highest derivative involved in it.

On a second axis, we need to distinguish between non-linear (e.g. (1.0.1) for
$H(q,p) = \frac{1}{2m} p^2 + V(q)$ where $V$ is not a second-order polynomial) and linear
equations (e.g. (1.0.1) for $V(q) = \omega^2 q^2$ or (1.0.6)).

We can write any differential equation in the form

$$L(u) = f$$

where the contribution $L(u)$ which depends on the solution $u$ is collected on the left-hand
side and the $u$-independent part $f$ is on the right-hand side.

Linear differential equations are the distinguished case where the operator $L$ satisfies

$$L(\alpha_1 u_1 + \alpha_2 u_2) = \alpha_1 L(u_1) + \alpha_2 L(u_2) \tag{1.0.7}$$

for all scalars $\alpha_1, \alpha_2$ and suitable functions $u_1, u_2$; otherwise, the differential
equation is non-linear. Among linear differential equations we further distinguish between
homogeneous ($f = 0$) and inhomogeneous ($f \neq 0$) linear differential equations.

Solving linear differential equations is much, much easier, because linear combinations
of solutions of the homogeneous equation $L(u) = 0$ are once again solutions of the
homogeneous equation. In other words, the solutions form a vector space. This makes it easier
to find solutions which satisfy the correct initial conditions by, for instance, systematically
finding all solutions to $L(u) = 0$ and then forming suitable linear combinations.

However, there are cases when solving a non-linear problem may be more desirable. In 
case of many-particle quantum mechanics, a non-linear problem on a lower-dimensional 
space is often preferable to a high-dimensional linear problem. 

Secondly, one can often relate easier-to-solve ordinary differential equations to partial
differential equations in a systematic fashion, e.g. by means of semiclassical limits which
relate (1.0.1) and (1.0.2), by “diagonalizing” a PDE or considering an associated
“eigenvalue problem”.

A last important weapon in the arsenal of a mathematical physicist is to systematically 
exploit symmetries. 

Well-posedness The most fundamental question one may ask about a differential equation
on a domain of functions (which may include boundary conditions and such) is
whether it is a well-posed problem, i.e.

(1) whether a (non-trivial) solution exists,

(2) whether the solution is unique and

(3) whether the solution depends on the initial conditions in a continuous fashion (stability).

Answering these questions may sound like a mere exercise, but it can be tremendously hard.
(Proving the well-posedness of the Navier-Stokes equation is one of the as yet unsolved
Millennium Prize Problems, for instance.) Indeed, there are countless examples where a
differential equation either has no non-trivial solution^1 or the solution is not unique;
we will come across some of these cases in the exercises.

Course outline This course is located at the intersection of mathematics and physics, so
one of the tasks is to establish a dictionary between the mathematics and physics
communities. Both communities have benefited from each other tremendously over the course of
history: physicists would often generate new problems for mathematicians while
mathematicians build and refine new tools to analyze problems from physics.

Concretely, the course is structured so as to show the interplay between different fields
of mathematics (e.g. functional analysis, harmonic analysis and the theory of Schrödinger
operators) as well as to consider different aspects of solving differential equations.

However, the course is not meant to be a comprehensive introduction to any of these 
fields in particular, but just give an overview, elucidate some of the connections and whet 
the appetite for more. 


^1 For linear homogeneous differential equations, if the zero function is in the domain, it is
automatically a solution. Hence, the zero function is often referred to as the “trivial solution”.

Chapter 2

Ordinary differential equations 


An ordinary differential equation, or ODE for short, is an equation of the form

$$\frac{d}{dt} x = \dot{x} = F(x), \qquad x(0) = x_0 \in U \subseteq \mathbb{R}^n, \tag{2.0.1}$$

defined in terms of a vector field $F = (F_1, \ldots, F_n) : U \subseteq \mathbb{R}^n \to \mathbb{R}^n$. Its solutions
are curves $x = (x_1, \ldots, x_n) : (-T, +T) \to U$ in $\mathbb{R}^n$ for times up to $0 < T \le +\infty$. You can
also consider ODEs on other spaces, e.g. $\mathbb{C}^n$.

The solution to an ODE is a flow $\Phi : (-T, +T) \times U \to U$, i.e. the map which satisfies

$$\Phi_t(x_0) = x(t),$$

where $x(t)$ is the solution of (2.0.1) with initial condition $x_0$. Locally, $\Phi$ is a
“group representation”^2:

(i) $\Phi_0 = \mathrm{id}_U$,

(ii) $\Phi_{t_1} \circ \Phi_{t_2} = \Phi_{t_1 + t_2}$ as long as $t_1, t_2, t_1 + t_2 \in (-T, +T)$, and

(iii) $\Phi_t \circ \Phi_{-t} = \mathrm{id}_U$ for all $t \in (-T, +T)$.

The ODE (2.0.1) uniquely determines the flow $\Phi$ and vice versa: on the one hand, we can
define the flow map $\Phi_t(x_0) = x(t)$ for each initial condition $x_0$ and $t$ as the solution of
(2.0.1). Property (i) follows from $x(0) = x_0$. The other two properties follow from the
explicit construction of the solution later on in the proof of Theorem 2.2.3.

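On the other hand, assume we are given a flow $\Phi$ with properties (i)-(iii) above which
is differentiable in $t$. Then we can recover the vector field $F$ at the point $x_0$ by taking
the time derivative of the flow,

$$\frac{d}{dt} \Phi_t(x_0) \Big|_{t=0} = \lim_{\delta \to 0} \frac{1}{\delta} \bigl( \Phi_\delta(x_0) - \Phi_0(x_0) \bigr) =: F\bigl(\Phi_0(x_0)\bigr) = F(x_0).$$

Now by property (ii), the above also holds at any later point in time: setting $y := \Phi_t(x_0)$,

$$\begin{aligned}
\frac{d}{dt} \Phi_t(x_0) &= \lim_{\delta \to 0} \frac{1}{\delta} \bigl( \Phi_{t+\delta}(x_0) - \Phi_t(x_0) \bigr) \\
&= \lim_{\delta \to 0} \frac{1}{\delta} \bigl( \Phi_\delta(\Phi_t(x_0)) - \Phi_t(x_0) \bigr) \\
&= \lim_{\delta \to 0} \frac{1}{\delta} \bigl( \Phi_\delta(y) - y \bigr) \\
&= F(y) = F\bigl(\Phi_t(x_0)\bigr).
\end{aligned}$$

In other words, we have just shown

$$\frac{d}{dt} \Phi_t(x_0) = F\bigl(\Phi_t(x_0)\bigr), \qquad \Phi_0(x_0) = x_0,$$

or, more compactly,

$$\frac{d}{dt} \Phi_t = F \circ \Phi_t, \qquad \Phi_0 = \mathrm{id}.$$

^2 If $T = +\infty$, it really is a group representation of $(\mathbb{R}, +)$.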
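The relation between a vector field and its flow can be checked on a toy example. The sketch below is not part of the course material: it assumes the hypothetical scalar ODE $\dot{x} = x^2$, whose flow $\Phi_t(x_0) = x_0 / (1 - t x_0)$ is known in closed form and exists only locally (for $t x_0 < 1$). It verifies properties (i)-(iii) and the identity $\frac{d}{dt} \Phi_t = F \circ \Phi_t$ with a finite difference.

```python
def F(x):
    """Vector field of the (hypothetical) example ODE x' = x**2."""
    return x * x


def flow(t, x0):
    """Explicit local flow Phi_t(x0) = x0 / (1 - t*x0), valid while t*x0 < 1."""
    if t * x0 >= 1.0:
        raise ValueError("the solution has blown up: need t * x0 < 1")
    return x0 / (1.0 - t * x0)


def time_derivative(t, x0, h=1e-6):
    """Central finite-difference approximation of d/dt Phi_t(x0)."""
    return (flow(t + h, x0) - flow(t - h, x0)) / (2.0 * h)


x0, t1, t2 = 1.0, 0.2, 0.3

identity_defect = abs(flow(0.0, x0) - x0)                          # property (i)
group_defect = abs(flow(t1, flow(t2, x0)) - flow(t1 + t2, x0))     # property (ii)
inverse_defect = abs(flow(-t1, flow(t1, x0)) - x0)                 # property (iii)
ode_defect = abs(time_derivative(t1, x0) - F(flow(t1, x0)))        # d/dt Phi_t = F(Phi_t)

print(identity_defect, group_defect, inverse_defect, ode_defect)
```

All four defects should vanish up to rounding and discretization error. Note that `flow(1.0, 1.0)` raises an error: the flow is only defined on a finite time interval, which is why $\Phi$ is a group representation only locally.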

2013.09.10

Clearly, there are three immediate questions:

(1) When does the flow exist? 

(2) Is it unique? 

(3) How large can we make T? 

2.1 Linear ODEs 

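The simplest ODEs are linear, where the vector field

$$F(x) = Hx$$

is defined in terms of an $n \times n$ matrix $H = (H_{jk})_{1 \le j,k \le n} \in \mathrm{Mat}_{\mathbb{C}}(n)$. The flow
$\Phi_t = e^{tH}$ is given in terms of the matrix exponential

$$e^{tH} = \exp(tH) := \sum_{k=0}^{\infty} \frac{t^k}{k!} H^k. \tag{2.1.1}$$

This series converges in the vector space $\mathrm{Mat}_{\mathbb{C}}(n) \cong \mathbb{C}^{n^2}$: if we choose

$$\|H\| := \max_{1 \le j,k \le n} |H_{jk}|$$

as norm^3, then $\|AB\| \le n \, \|A\| \, \|B\|$ holds for all matrices $A$ and $B$, and we can see that
the sequence of partial sums $S_N := \sum_{k=0}^{N} \frac{t^k}{k!} H^k$ is a Cauchy sequence: for $K < N$,

$$\|S_N - S_K\| \le \sum_{k=K+1}^{N} \frac{|t|^k}{k!} \|H^k\| \le \sum_{k=K+1}^{N} \frac{(n \, |t| \, \|H\|)^k}{k!} \xrightarrow{K, N \to \infty} 0.$$

Using the completeness of $\mathbb{C}^{n^2}$ with respect to the norm $\|\cdot\|$ (which follows from the
completeness of $\mathbb{C}$ with respect to the absolute value), we now deduce that $S_N \to e^{tH}$ as
$N \to \infty$. Moreover, we also obtain the norm bound $\|e^{tH}\| \le e^{n |t| \|H\|}$.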
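To see the series (2.1.1) at work, here is a minimal sketch in plain Python that evaluates the partial sums $S_N$ for a small matrix. The helper names (`mat_mul`, `mat_exp`, `max_norm`) and the test matrix are illustrative choices, not taken from the notes. For $H = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ one has $H^2 = -\mathrm{id}$, so $e^{tH}$ is a rotation matrix, which gives an independent check.

```python
import math


def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(H, t, terms=40):
    """Partial sum S_N = sum_{k=0}^{N} t^k/k! H^k of the exponential series, N = terms."""
    n = len(H)
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # k = 0: identity
    total = [row[:] for row in term]
    for k in range(1, terms + 1):
        # t^k/k! H^k is obtained from the previous term by multiplying with (t/k) H
        term = [[t / k * entry for entry in row] for row in mat_mul(term, H)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total


def max_norm(A):
    """The norm ||A|| = max_{j,k} |A_jk| used in the text."""
    return max(abs(entry) for row in A for entry in row)


H = [[0.0, 1.0], [-1.0, 0.0]]
t = 1.3
E = mat_exp(H, t)
rotation = [[math.cos(t), math.sin(t)], [-math.sin(t), math.cos(t)]]
series_error = max(abs(E[i][j] - rotation[i][j]) for i in range(2) for j in range(2))

# flow property e^{(s+t)H} = e^{sH} e^{tH}
composed = mat_mul(mat_exp(H, 0.5), mat_exp(H, 0.8))
group_error = max(abs(E[i][j] - composed[i][j]) for i in range(2) for j in range(2))
print(series_error, group_error)
```

Forty terms are far more than needed here; for matrices with large $|t| \, \|H\|$ the truncated series converges slowly and can lose accuracy, which is why this is a sketch rather than a general-purpose routine.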

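From linear algebra we know that any matrix $H$ is similar to a matrix of the form

$$J = U^{-1} H U = \begin{pmatrix} J_{r_1}(\lambda_1) & 0 & \cdots & 0 \\ 0 & \ddots & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & J_{r_N}(\lambda_N) \end{pmatrix}$$

where the $\lambda_j$ are the eigenvalues of $H$ and the

$$J_{r_j}(\lambda_j) = \begin{pmatrix} \lambda_j & 1 & 0 & 0 \\ 0 & \ddots & \ddots & 0 \\ \vdots & \ddots & \ddots & 1 \\ 0 & \cdots & 0 & \lambda_j \end{pmatrix} =: \lambda_j \, \mathrm{id}_{\mathbb{C}^{r_j}} + N_{r_j} \in \mathrm{Mat}_{\mathbb{C}}(r_j)$$

are the $r_j$-dimensional Jordan blocks associated to $\lambda_j$. Now the exponential

$$e^{tH} = U e^{tJ} U^{-1}$$

is also similar to the exponential of $tJ$. Moreover, one can see that

$$e^{tJ} = \begin{pmatrix} e^{t J_{r_1}(\lambda_1)} & 0 & \cdots & 0 \\ 0 & \ddots & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & e^{t J_{r_N}(\lambda_N)} \end{pmatrix}$$

and the matrix exponential of a Jordan block can be computed explicitly (cf. problem 6
on Sheet 02),

$$e^{t J_r(\lambda)} = e^{t\lambda} \begin{pmatrix} 1 & t & \frac{t^2}{2} & \cdots & \frac{t^{r-1}}{(r-1)!} \\ 0 & 1 & t & \cdots & \frac{t^{r-2}}{(r-2)!} \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & 1 & t \\ 0 & \cdots & \cdots & 0 & 1 \end{pmatrix}.$$

In case $H$ is similar to a diagonal matrix,

$$H = U \begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \ddots & 0 \\ 0 & 0 & \lambda_n \end{pmatrix} U^{-1},$$

this formula simplifies to

$$e^{tH} = U \begin{pmatrix} e^{t\lambda_1} & 0 & 0 \\ 0 & \ddots & 0 \\ 0 & 0 & e^{t\lambda_n} \end{pmatrix} U^{-1}.$$

^3 We could have picked any other norm on $\mathbb{C}^{n^2}$ since all norms on finite-dimensional vector
spaces are equivalent.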


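Example (Free Schrödinger equation) The solution to

$$i \frac{d}{dt} \hat{\psi}(t) = \tfrac{1}{2} k^2 \, \hat{\psi}(t), \qquad \hat{\psi}(0) = \hat{\psi}_0 \in \mathbb{C},$$

is $\hat{\psi}(t) = e^{-i t \frac{1}{2} k^2} \hat{\psi}_0$.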

Inhomogeneous linear ODEs In case the linear ODE is inhomogeneous,

$$\dot{x}(t) = A x(t) + f(t), \qquad x(0) = x_0,$$

it turns out we can still find a closed solution:

$$x(t) = e^{tA} x_0 + \int_0^t ds \, e^{(t-s)A} f(s). \tag{2.1.2}$$
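Formula (2.1.2) can be tested in the simplest possible setting. The following sketch assumes a scalar equation (so $A = a \in \mathbb{R}$) with the illustrative inhomogeneity $f(t) = \cos t$; it evaluates the integral by composite Simpson quadrature and compares with the closed-form solution $x(t) = (x_0 - \alpha) e^{at} + \alpha \cos t + \beta \sin t$, where $\beta = 1/(1 + a^2)$ and $\alpha = -a \beta$. All names and parameter values are assumptions made for this example.

```python
import math


def duhamel(a, f, x0, t, steps=2000):
    """x(t) = e^{ta} x0 + int_0^t e^{(t-s)a} f(s) ds via composite Simpson (steps must be even)."""
    h = t / steps

    def integrand(s):
        return math.exp((t - s) * a) * f(s)

    integral = integrand(0.0) + integrand(t)
    for i in range(1, steps):
        integral += (4 if i % 2 else 2) * integrand(i * h)
    integral *= h / 3.0
    return math.exp(t * a) * x0 + integral


def exact(a, x0, t):
    """Closed-form solution of x' = a x + cos(t), x(0) = x0."""
    beta = 1.0 / (1.0 + a * a)
    alpha = -a * beta
    return (x0 - alpha) * math.exp(a * t) + alpha * math.cos(t) + beta * math.sin(t)


a, x0, t = -0.5, 1.0, 2.0
duhamel_error = abs(duhamel(a, math.cos, x0, t) - exact(a, x0, t))
print(duhamel_error)
```

The first term propagates the initial condition, while the integral accumulates the contributions of the source $f(s)$, each propagated from time $s$ to time $t$.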


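Higher-order ODEs

$$\frac{d^n}{dt^n} x = F(x) + f(t), \qquad \frac{d^j}{dt^j} x(0) = \alpha^{(j)}, \quad j = 0, \ldots, n-1, \tag{2.1.3}$$

can be expressed as a system of first-order ODEs. If $x \in \mathbb{R}$ for simplicity, we can write

$$\begin{aligned}
y_1 &:= x, \\
y_2 &:= \frac{d}{dt} y_1 = \frac{d}{dt} x, \\
&\;\;\vdots \\
y_n &:= \frac{d}{dt} y_{n-1} = \frac{d^{n-1}}{dt^{n-1}} x,
\end{aligned}$$

so that $\frac{d}{dt} y_n = \frac{d^n}{dt^n} x = F(x) + f(t) = F(y_1) + f(t)$.

Thus, assuming the vector field $F(x) = \lambda x$, $\lambda \in \mathbb{R}$, is linear and the dimension of the
underlying space is 1, we obtain the first-order linear equation

$$\frac{d}{dt} \begin{pmatrix} y_1 \\ \vdots \\ \vdots \\ y_n \end{pmatrix} = \underbrace{\begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & 1 & 0 \\ 0 & \cdots & \cdots & 0 & 1 \\ \lambda & 0 & \cdots & \cdots & 0 \end{pmatrix}}_{=: H} \begin{pmatrix} y_1 \\ \vdots \\ \vdots \\ y_n \end{pmatrix} + \begin{pmatrix} 0 \\ \vdots \\ 0 \\ f(t) \end{pmatrix} \tag{2.1.4}$$

and the solution is given by (2.1.2) with initial condition $y_0 = (\alpha^{(0)}, \alpha^{(1)}, \ldots, \alpha^{(n-1)})$.
The solution to (2.1.3) is then just the first component $y_1(t)$.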
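The reduction to a first-order system is also how higher-order equations are usually fed to numerical integrators. The sketch below is an illustration, not the method of the notes: it assumes $n = 2$, writes $\ddot{x} = \lambda x + f(t)$ in the form (2.1.4) and integrates it with the classical fourth-order Runge-Kutta scheme (a swapped-in numerical technique, in place of the closed formula (2.1.2)). For $\lambda = -1$, $f = 0$, $x(0) = 1$, $\dot{x}(0) = 0$ the first component must reproduce $\cos t$.

```python
import math


def companion_rhs(lam, f):
    """Right-hand side of y' = H y + (0, f(t)) for the second-order ODE x'' = lam*x + f(t)."""
    def rhs(t, y):
        y1, y2 = y
        return (y2, lam * y1 + f(t))
    return rhs


def rk4(rhs, y0, t_end, steps):
    """Classical fourth-order Runge-Kutta integration from t = 0 to t = t_end."""
    h = t_end / steps
    t, y = 0.0, tuple(y0)
    for _ in range(steps):
        k1 = rhs(t, y)
        k2 = rhs(t + h / 2, tuple(yi + h / 2 * ki for yi, ki in zip(y, k1)))
        k3 = rhs(t + h / 2, tuple(yi + h / 2 * ki for yi, ki in zip(y, k2)))
        k4 = rhs(t + h, tuple(yi + h * ki for yi, ki in zip(y, k3)))
        y = tuple(yi + h / 6 * (a + 2 * b + 2 * c + d)
                  for yi, a, b, c, d in zip(y, k1, k2, k3, k4))
        t += h
    return y


# x'' = -x with x(0) = 1, x'(0) = 0, i.e. y_0 = (alpha^(0), alpha^(1)) = (1, 0)
rhs = companion_rhs(-1.0, lambda t: 0.0)
y1, y2 = rk4(rhs, (1.0, 0.0), t_end=1.0, steps=1000)
print(y1 - math.cos(1.0), y2 + math.sin(1.0))
```

Only the first component `y1` is the solution of the original second-order equation; the second component carries the velocity along as an auxiliary variable.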




Example (Newton's equation of motion for a free particle) The dynamics of a free particle
of mass $m$ starting at $x_0$ with initial momentum $p_0$ is

$$\ddot{x} = 0, \qquad x(0) = x_0, \quad \dot{x}(0) = \frac{p_0}{m}.$$

This leads to the first-order equation

$$\frac{d}{dt} \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} =: H y.$$

The matrix $H$ is nilpotent, $H^2 = 0$, and thus the exponential series

$$e^{tH} = \sum_{k=0}^{\infty} \frac{t^k}{k!} H^k = \mathrm{id} + tH + 0 = \begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix}$$

terminates after finitely many terms. Hence, the solution is

$$x(t) = y_1(t) = x_0 + t \, \frac{p_0}{m}.$$


2.2 Existence and uniqueness of solutions 

Now we turn back to the non-linear case. One important example of non-linear ODEs are 
Hamilton’s equations of motion (1.0.1) which describe the motion of a classical particle; 
this will be the content of Chapter 3. 

The standard existence and uniqueness result is the Picard-Lindelöf theorem. The crucial
idea is to approximate the solution to the ODE and to improve this approximation
iteratively. To have a notion of convergence of solutions, we need to introduce the idea of


Definition 2.2.1 (Metric space) Let $X$ be a set. A mapping $d : X \times X \to [0, +\infty)$ with
properties

(i) $d(x,y) = 0$ exactly if $x = y$ (definiteness),

(ii) $d(x,y) = d(y,x)$ (symmetry), and

(iii) $d(x,z) \le d(x,y) + d(y,z)$ (triangle inequality),

for all $x, y, z \in X$ is called metric. We refer to $(X,d)$ as metric space (often only denoted as
$X$). A metric space $(X,d)$ is called complete if all Cauchy sequences $(x_n)$ (with respect to
the metric) converge to some $x \in X$.

A metric gives a notion of distance, and thus a notion of convergence, continuity and
open sets (a topology): quite naturally, one considers the topology generated by open balls
defined in terms of $d$. There are more general ways to study convergence, and alternative
topologies (e.g. Fréchet topologies or weak topologies) can be both useful and necessary.


Example (i) Let $X$ be a set and define

$$d(x,y) := \begin{cases} 1 & x \neq y \\ 0 & x = y \end{cases}.$$

It is easy to see that $d$ satisfies the axioms of a metric and $X$ is complete with respect to
$d$. This particular choice leads to the discrete topology.

(ii) Let $X = C([a,b], U)$, $U \subseteq \mathbb{R}^n$ closed, be the space of continuous functions on an
interval. Then one naturally considers the metric

$$d_\infty(f,g) := \sup_{x \in [a,b]} |f(x) - g(x)| = \max_{x \in [a,b]} |f(x) - g(x)|$$

with respect to which $C([a,b], U)$ is complete.


One important result we need concerns the existence of so-called fixed points in complete
metric spaces:

Theorem 2.2.2 (Banach's fixed point theorem) Let $(X,d)$ be a complete metric space
and $P : X \to X$ a contraction, i.e. a map for which there exists $C \in [0,1)$ so that for all
$x, y \in X$

$$d\bigl(P(x), P(y)\bigr) \le C \, d(x,y) \tag{2.2.1}$$

holds. Then there exists a unique fixed point $x_* = P(x_*)$ so that for any $x_0 \in X$, the sequence

$$\bigl\{ P^n(x_0) \bigr\}_{n \in \mathbb{N}} = \bigl\{ \underbrace{P \circ \cdots \circ P}_{n \text{ times}} (x_0) \bigr\}_{n \in \mathbb{N}}$$

converges to $x_* \in X$.

Proof Let us define $x_n := P^n(x_0)$ for brevity. To show existence of a fixed point, we will
prove that $\{x_n\}_{n \in \mathbb{N}}$ is a Cauchy sequence. First of all, the distance between neighboring
sequence elements goes to 0,

$$d(x_{n+1}, x_n) = d\bigl(P(x_n), P(x_{n-1})\bigr) \le C \, d(x_n, x_{n-1}) \le \ldots \le C^n \, d(x_1, x_0). \tag{2.2.2}$$

Without loss of generality, we shall assume that $m < n$. Then, we use the triangle inequality
to estimate the distance between $x_n$ and $x_m$ by distances to neighbors,

$$d(x_n, x_m) \le d(x_n, x_{n-1}) + d(x_{n-1}, x_{n-2}) + \ldots + d(x_{m+1}, x_m) = \sum_{j=m+1}^{n} d(x_j, x_{j-1}).$$

Hence, we can plug in the estimate on the distance between neighbors, (2.2.2), and sum
over $j$,

$$d(x_n, x_m) \le \sum_{j=m+1}^{n} C^{j-1} \, d(x_1, x_0) = C^m \sum_{j=0}^{n-m-1} C^j \, d(x_1, x_0) \le C^m \sum_{j=0}^{\infty} C^j \, d(x_1, x_0) = \frac{C^m}{1 - C} \, d(x_1, x_0).$$

If we choose $m$ large enough, we can make the right-hand side as small as we want and
thus, $\{x_n\}_{n \in \mathbb{N}}$ is a Cauchy sequence. By assumption the space $(X,d)$ is complete, and
thus, there exists $x_*$ so that

$$\lim_{n \to \infty} x_n = \lim_{n \to \infty} P^n(x_0) = x_*.$$

2013.09.12

It remains to show that this fixed point is unique. Pick another initial point $x_0'$ with limit
$x_*' = \lim_{n \to \infty} P^n(x_0')$; this limit exists by the arguments above. Both limits are fixed points,
$P(x_*) = x_*$ and $P(x_*') = x_*'$, because $P$ is continuous (being a contraction) and therefore

$$P(x_*) = P\Bigl( \lim_{n \to \infty} P^n(x_0) \Bigr) = \lim_{n \to \infty} P^{n+1}(x_0) = x_*$$

holds (and analogously for $x_*'$). Using the contractivity property (2.2.1), we then estimate
the distance between $x_*$ and $x_*'$ by

$$d(x_*, x_*') = d\bigl(P(x_*), P(x_*')\bigr) \le C \, d(x_*, x_*').$$

Since $0 \le C < 1$, the above inequality can only hold if $d(x_*, x_*') = 0$, which implies $x_* = x_*'$.
Hence, the fixed point is also unique. □
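The proof is constructive, and the estimate $d(x_m, x_*) \le \frac{C^m}{1-C} \, d(x_1, x_0)$ (obtained by letting $n \to \infty$ in the last chain of inequalities) tells us in advance how many iterations are needed. As an illustration, not taken from the notes, consider $P = \cos$ on the complete metric space $X = [0,1]$: $\cos$ maps $[0,1]$ into itself, and by the mean value theorem it is a contraction with constant $C = \sin 1 < 1$.

```python
import math


def fixed_point_iterates(P, x0, n):
    """Return the list [x_0, P(x_0), P^2(x_0), ..., P^n(x_0)]."""
    xs = [x0]
    for _ in range(n):
        xs.append(P(xs[-1]))
    return xs


C = math.sin(1.0)                      # contraction constant of cos on [0, 1]
xs = fixed_point_iterates(math.cos, 1.0, 200)
x_star = xs[-1]                        # numerical approximation of the fixed point
d10 = abs(xs[1] - xs[0])               # d(x_1, x_0)


def a_priori_bound(m):
    """Upper bound C^m / (1 - C) * d(x_1, x_0) on the distance d(x_m, x_*)."""
    return C ** m / (1.0 - C) * d10


for m in (0, 5, 10, 20):
    print(m, abs(xs[m] - x_star), a_priori_bound(m))
```

The actual error decays faster than the a priori bound, because near the fixed point the effective contraction constant is $|\sin x_*| < \sin 1$; the bound is nevertheless guaranteed for every $m$ and every starting point in $[0,1]$.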

To ensure the existence of the flow, we need to impose conditions on the vector field. One
common choice is to require $F$ to be Lipschitz, meaning there exists a constant $L > 0$ so
that

$$|F(x) - F(x')| \le L \, |x - x'| \tag{2.2.3}$$

holds for all $x$ and $x'$ in some open neighborhood $U$ of the initial point $x_0 = x(0)$. The
Lipschitz condition has two implications: first of all, it states that the vector field grows
at most linearly. Secondly, if the vector field is continuously differentiable, $L$ is also a
bound on the norm of the differential, $\sup_{x \in U} \|DF(x)\|$. However, not all vector fields
which are Lipschitz need to be continuously differentiable; (2.2.3) is in fact weaker than
requiring that $\sup_{x \in U} \|DF(x)\|$ is bounded. For instance, the vector field $F(x) = -|x|$ in
one dimension is Lipschitz on all of $\mathbb{R}$, but not differentiable at $x = 0$.

So if the vector field is locally Lipschitz, the flow exists at least for some time interval:

Theorem 2.2.3 (Picard-Lindelöf) Let $F$ be a continuous vector field, $F \in C(U, \mathbb{R}^n)$,
$U \subseteq \mathbb{R}^n$ open, which defines a system of differential equations,

$$\dot{x} = F(x). \tag{2.2.4}$$

Assume for a certain initial condition $x_0 \in U$ there exists a closed ball

$$B_\rho(x_0) := \bigl\{ x \in \mathbb{R}^n \mid |x - x_0| \le \rho \bigr\} \subseteq U, \qquad \rho > 0,$$

such that $F$ is Lipschitz on $B_\rho(x_0)$ with Lipschitz constant $L$ (cf. equation (2.2.3)). Then
the initial value problem, equation (2.2.4) with $x(0) = x_0$, has a unique solution $t \mapsto x(t)$
for times $|t| \le T := \min\bigl( \rho / v_{\max}, \tfrac{1}{2L} \bigr)$ where the maximal velocity is defined as
$v_{\max} := \sup_{x \in B_\rho(x_0)} |F(x)|$.


Proof The proof consists of three steps:

Step 1: We can rewrite the initial value problem, equation (2.2.4) with $x(0) = x_0$, as

$$x(t) = x_0 + \int_0^t ds \, F\bigl(x(s)\bigr).$$

This equation can be solved iteratively: we define $x_{(0)}(t) := x_0$ and the $(n+1)$th iteration
$x_{(n+1)}(t) := \bigl(P(x_{(n)})\bigr)(t)$ in terms of the so-called Picard map

$$\bigl(P(x_{(n)})\bigr)(t) := x_0 + \int_0^t ds \, F\bigl(x_{(n)}(s)\bigr).$$

Step 2: We will determine $T > 0$ small enough so that $P : X \to X$ is a contraction on
the space of trajectories which start at $x_0$,

$$X := \bigl\{ y \in C\bigl([-T,+T], B_\rho(x_0)\bigr) \mid y(0) = x_0 \bigr\}.$$

First, we note that $X$ is a complete metric space if we use

$$d(y,z) := \sup_{t \in [-T,+T]} |y(t) - z(t)|, \qquad y, z \in X,$$

to measure distances between trajectories. Hence, Banach's fixed point theorem 2.2.2
applies, and once we show that $P$ is a contraction, we know that $x_{(n)}$ converges to the
(unique) fixed point $x = P(x) \in X$; this fixed point, in turn, is the solution to the ODE
(2.2.4) with $x(0) = x_0$.

To ensure $P$ is a contraction with $C = 1/2$, we propose $T \le \tfrac{1}{2L}$. Then the Lipschitz
property implies that for any $y, z \in X$, we have

$$d\bigl(P(y), P(z)\bigr) = \sup_{t \in [-T,+T]} \left| \int_0^t ds \, \bigl[ F(y(s)) - F(z(s)) \bigr] \right| \le T L \sup_{t \in [-T,+T]} |y(t) - z(t)| \le \tfrac{1}{2L} \, L \, d(y,z) = \tfrac{1}{2} \, d(y,z).$$

Step 3: We need to ensure that the trajectory does not leave the ball $B_\rho(x_0)$ for all
$t \in [-T,+T]$: for any $y \in X$, we have

$$\bigl| (P(y))(t) - x_0 \bigr| = \left| \int_0^t ds \, F\bigl(y(s)\bigr) \right| \le |t| \, v_{\max} \le T \, v_{\max} \le \rho.$$

Hence, as long as $T \le \min\bigl\{ \tfrac{1}{2L}, \rho / v_{\max} \bigr\}$, trajectories exist and do not leave $U$. This
concludes the proof. □
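The Picard iteration from Step 1 can be carried out exactly when the iterates stay polynomial. The sketch below assumes the illustrative linear vector field $F(x) = x$ with $x_0 = 1$ (an example chosen here, not one from the notes): each application of the Picard map integrates the previous polynomial term by term, and the iterates turn out to be the Taylor polynomials of $e^t$. Polynomials are stored as coefficient lists, and `picard_step`, `picard_iterates` and `evaluate` are hypothetical helper names.

```python
from fractions import Fraction


def picard_step(coeffs, x0):
    """Picard map for F(x) = x applied to the polynomial sum_k coeffs[k] * t^k.

    (P x)(t) = x0 + int_0^t x(s) ds: integrate term by term, then set the constant term to x0.
    """
    integrated = [Fraction(0)] + [c / (k + 1) for k, c in enumerate(coeffs)]
    integrated[0] = Fraction(x0)
    return integrated


def picard_iterates(x0, n):
    """Coefficients of the n-th Picard iterate x_(n), starting from x_(0)(t) = x0."""
    x = [Fraction(x0)]
    for _ in range(n):
        x = picard_step(x, x0)
    return x


def evaluate(coeffs, t):
    """Evaluate the polynomial with the given coefficients at time t (Horner scheme)."""
    result = Fraction(0)
    for c in reversed(coeffs):
        result = result * t + c
    return result


print(picard_iterates(1, 5))
print(float(evaluate(picard_iterates(1, 20), Fraction(1, 2))))
```

For this vector field the Lipschitz constant is $L = 1$, so the theorem guarantees the contraction property on $|t| \le 1/2$; the iterates nevertheless converge for all $t$, which reflects that the time $T$ in the theorem is a sufficient, not a sharp, bound.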


2013.09.17 




This existence result for a single initial condition implies immediately that

Corollary 2.2.4 Assume we are in the setting of Theorem 2.2.3. Then for any $x_0' \in U$ there
exists an open neighborhood $V$ of $x_0'$ so that the flow exists as a map $\Phi : [-T,+T] \times V \to U$.

Proof Pick any $x_0' \in U$. Then according to the Picard-Lindelöf Theorem 2.2.3 there exists
a ball $B_\rho(x_0') \subseteq U$ around $x_0'$ so that the trajectory $t \mapsto x(t)$ with $x(0) = x_0'$ exists
for all times $t \in [-T,+T]$ where $T = \min\bigl\{ \rho / v_{\max}, \tfrac{1}{2L} \bigr\}$ and $v_{\max} = \sup_{x \in B_\rho(x_0')} |F(x)|$.

Now consider initial conditions $x_0 \in B_{\rho/2}(x_0')$: since the vector field satisfies the same
Lipschitz condition as before, the arguments from Step 3 of the proof of the Picard-Lindelöf
Theorem tell us that for times $t \in [-T/2, +T/2]$, the particles will not leave
$B_\rho(x_0')$. Put more concretely: the maximum velocity of the particle (which is the maximum
of the vector field) dictates how far it can go. So even if we start at the border of the
ball with radius $\rho/2$, for short enough times (e.g. $T/2$) we can make sure it never reaches
the boundary of $B_\rho(x_0')$.

This means the flow map $\Phi : [-T/2, +T/2] \times B_{\rho/2}(x_0') \to B_\rho(x_0')$ exists. Note that since
$B_{\rho/2}(x_0')$ is contained in $B_\rho(x_0') \subseteq U$, we know that the smaller ball is also a subset of $U$. □

Another important fact is that the flow $\Phi$ inherits the smoothness of the vector field which
generates it.

Theorem 2.2.5 Assume the vector field $F$ is $k$ times continuously differentiable,
$F \in C^k(U, \mathbb{R}^n)$, $U \subseteq \mathbb{R}^n$. Then the flow $\Phi$ associated to (2.2.4) is also $k$ times
continuously differentiable, i.e. $\Phi \in C^k\bigl([-T,+T] \times V, U\bigr)$ where $V \subseteq U$ is suitable.

Proof We refer to Chapter 3, Section 7.3 in [Arn06]. □

2.2.1 Interlude: the Gronwall lemma 

To show global uniqueness of the flow, we need to make use of the exceedingly useful 
Gronwall lemma. It is probably the simplest “differential inequality”. 

Lemma 2.2.6 (Gronwall) Let $u$ be differentiable on the interior of $I = [a,b]$ or
$I = [a,+\infty)$, and satisfy the differential inequality

$$\dot{u}(t) \le \beta(t) \, u(t)$$

where $\beta$ is a real-valued, continuous function on $I$. Then

$$u(t) \le u(a) \, e^{\int_a^t ds \, \beta(s)} \tag{2.2.5}$$

holds for all $t \in I$.

Proof Define the function

$$v(t) := e^{\int_a^t ds \, \beta(s)}.$$

Then $\dot{v}(t) = \beta(t) \, v(t)$ and $v(a) = 1$ hold, and we can use the assumption on $u$ to estimate

$$\frac{d}{dt} \frac{u(t)}{v(t)} = \frac{\dot{u}(t) v(t) - u(t) \dot{v}(t)}{v(t)^2} = \frac{\dot{u}(t) v(t) - u(t) \beta(t) v(t)}{v(t)^2} \le \frac{\beta(t) u(t) v(t) - \beta(t) u(t) v(t)}{v(t)^2} = 0.$$

Hence, equation (2.2.5) follows from the mean value theorem,

$$\frac{u(t)}{v(t)} \le \frac{u(a)}{v(a)} = u(a).$$ □


One important application is to relate bootstrap information on the vector fields to
information on the flow itself. For instance, if the vector fields of two ODEs are close, then also
their flows remain close, but only for logarithmic times at best. After that, the flows will
usually diverge exponentially.

If one applies this reasoning to the same ODE for different initial conditions, then one 
observes the same effect: no matter how close initial conditions are picked, usually, the 
trajectories will diverge exponentially. This fact gives rise to chaos. 


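Proposition 2.2.7 Suppose the vector field $F_\varepsilon = F_0 + \varepsilon F_1$ satisfies the Lipschitz
condition (2.2.3) for some open set $U \subseteq \mathbb{R}^n$ with Lipschitz constant $L > 0$, consisting of a
leading-order term $F_0$ that is also Lipschitz with constant $L$, and a small, bounded perturbation
$\varepsilon F_1$, i.e. $0 < \varepsilon \ll 1$ and $C := \sup_{x \in U} |F_1(x)| < \infty$.

Then the flows $\Phi^\varepsilon$ and $\Phi^{(0)}$ associated to $\dot{x}^\varepsilon = F_\varepsilon$ and $\dot{x}^{(0)} = F_0$ exist for the same
times $t \in [-T,+T]$, and the two are $O(\varepsilon)$ close in the following sense: there exists an open
neighborhood $V \subseteq U$ so that

$$\sup_{x \in V} \bigl| \Phi^\varepsilon_t(x) - \Phi^{(0)}_t(x) \bigr| \le \varepsilon \, \frac{C}{L} \bigl( e^{L|t|} - 1 \bigr) \tag{2.2.6}$$

holds.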

An important observation is that this result is in some sense optimal: while there are a
few specific ODEs (in particular linear ones) for which estimates analogous to (2.2.6) hold
for longer times, in general the Proposition really mirrors what happens: solutions will
diverge exponentially.

In physics, the time scale at which chaos sets in, $O(|\ln \varepsilon|)$, is known as Ehrenfest time
scale; classical chaos is the reason why semiclassics (approximating the linear quantum
evolution by non-linear classical dynamics) is limited to the Ehrenfest time scale.
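The estimate (2.2.6) can be probed numerically. The sketch below uses an illustrative one-dimensional example that is not from the notes: $F_0 = \sin$ (Lipschitz with $L = 1$) and $F_1 = \cos$ (bounded by $C = 1$). Both flows are approximated with a Runge-Kutta scheme, a numerical stand-in for the exact flows, and the gap between them is compared with the right-hand side of (2.2.6). Strictly speaking, $F_\varepsilon$ has a Lipschitz constant slightly larger than $L$; the example relies on the fact that the proof below only uses the Lipschitz constant of $F_0$ and the bound on $F_1$.

```python
import math


def rk4_scalar(F, x0, t_end, steps=2000):
    """Approximate the flow of x' = F(x) at time t_end with classical Runge-Kutta."""
    h = t_end / steps
    x = x0
    for _ in range(steps):
        k1 = F(x)
        k2 = F(x + h / 2 * k1)
        k3 = F(x + h / 2 * k2)
        k4 = F(x + h * k3)
        x += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return x


eps, L, C = 1e-3, 1.0, 1.0


def F0(x):
    """Leading-order vector field, Lipschitz with constant L = 1."""
    return math.sin(x)


def F_eps(x):
    """Perturbed vector field F_0 + eps * F_1 with F_1 = cos, |F_1| <= C = 1."""
    return math.sin(x) + eps * math.cos(x)


def gap(t, x0=0.5):
    """Distance between the perturbed and unperturbed trajectories at time t."""
    return abs(rk4_scalar(F_eps, x0, t) - rk4_scalar(F0, x0, t))


def bound(t):
    """Right-hand side of (2.2.6)."""
    return eps * C / L * (math.exp(L * abs(t)) - 1.0)


for t in (0.5, 1.0, 2.0, 4.0):
    print(t, gap(t), bound(t))
```

For the linear example $F_0(x) = L x$, $F_1(x) = C$ the gap equals the bound exactly, which illustrates in what sense the estimate cannot be improved in general.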


Proof First of all, due to Corollary 2.2.4, there exist $T > 0$ and an open neighborhood $V$
so that for all initial conditions $x_0 \in V$, the trajectories do not leave $U$.

Let $x^\varepsilon$ and $x^{(0)}$ be the trajectories which solve $\dot{x}^\varepsilon = F_\varepsilon(x^\varepsilon)$ and $\dot{x}^{(0)} = F_0(x^{(0)})$ for
the initial condition $x^\varepsilon(0) = x_0 = x^{(0)}(0)$. Moreover, let us define the difference vector
$X(t) := x^\varepsilon(t) - x^{(0)}(t)$ and $u(t) := |X(t)|$. Using $\bigl| |x| - |y| \bigr| \le |x - y|$ for vectors
$x, y \in \mathbb{R}^n$ (which follows from the triangle inequality), we obtain

$$\left| \frac{d}{dt} X(t) \right| = \lim_{\delta \searrow 0} \frac{|X(t+\delta) - X(t)|}{\delta} \ge \lim_{\delta \searrow 0} \frac{|X(t+\delta)| - |X(t)|}{\delta} = \frac{d}{dt} |X(t)|. \tag{2.2.7}$$

Hence, we can estimate the derivative of $u(t)$ from above by

$$\begin{aligned}
\frac{d}{dt} u(t) &\le \left| \frac{d}{dt} \bigl( x^\varepsilon(t) - x^{(0)}(t) \bigr) \right| = \bigl| F_\varepsilon(x^\varepsilon(t)) - F_0(x^{(0)}(t)) \bigr| \\
&\le \bigl| F_0(x^\varepsilon(t)) - F_0(x^{(0)}(t)) \bigr| + \varepsilon \, \bigl| F_1(x^\varepsilon(t)) \bigr| \\
&\le L \, \bigl| x^\varepsilon(t) - x^{(0)}(t) \bigr| + \varepsilon C = L \, u(t) + \varepsilon C.
\end{aligned}$$

The above inequality is in fact equivalent to

$$\frac{d}{dt} \bigl( e^{-Lt} u(t) \bigr) = e^{-Lt} \bigl( \dot{u}(t) - L \, u(t) \bigr) \le \varepsilon C \, e^{-Lt}.$$

Now we integrate left- and right-hand side: since we assume both trajectories to start
at the same point $x_0$, we have $u(0) = |x^\varepsilon(0) - x^{(0)}(0)| = |x_0 - x_0| = 0$. Integration
preserves the inequality, so that for $t > 0$ we obtain

$$\int_0^t ds \, \frac{d}{ds} \bigl( e^{-Ls} u(s) \bigr) = \bigl[ e^{-Ls} u(s) \bigr]_0^t = e^{-Lt} u(t) \le \int_0^t ds \, \varepsilon C \, e^{-Ls} = \varepsilon \, \frac{C}{L} \bigl( 1 - e^{-Lt} \bigr).$$

A similar result is obtained for $t < 0$. Hence, we get

$$u(t) \le \varepsilon \, \frac{C}{L} \bigl( e^{L|t|} - 1 \bigr).$$

Since the Lipschitz constant $L$ and $C$ were independent of the initial point $x_0$, this estimate
holds for all $x_0 \in V$, and we have shown equation (2.2.6). □

Note that the estimate (2.2.6) also holds globally (meaning for all times $t \in \mathbb{R}$ and initial
conditions $x_0 \in \mathbb{R}^n$) if the vector fields are globally Lipschitz, although we need the global
existence and uniqueness theorem, Corollary 2.2.8, from the next subsection.

2.2.2 Conclusion: global existence and uniqueness 

The Gronwall lemma is necessary to prove the uniqueness of the solutions in case the 
vector field is globally Lipschitz. 

Corollary 2.2.8 If the vector field $F$ satisfies the Lipschitz condition globally, i.e. there exists
$L > 0$ such that

$$|F(x) - F(x')| \le L \, |x - x'|$$

holds for all $x, x' \in \mathbb{R}^n$, then $t \mapsto \Phi_t(x_0)$ exists globally for all $t \in \mathbb{R}$ and $x_0 \in \mathbb{R}^n$.

Proof For every $x_0 \in \mathbb{R}^n$, we can solve the initial value problem at least for $|t| \le \tfrac{1}{2L}$.
Since we do not require the particle to remain in a neighborhood of the initial point as in
Theorem 2.2.3, the other condition on $T$ is void.

Using $\Phi_{t_1} \circ \Phi_{t_2} = \Phi_{t_1 + t_2}$, we obtain a global trajectory $x : \mathbb{R} \to \mathbb{R}^n$ for all times. However,
we potentially lose uniqueness of the trajectory. So assume $\tilde{x}$ is another trajectory with
the same initial condition. Then we define $u(t) := |x(t) - \tilde{x}(t)|$ and use the Lipschitz
property (2.2.3) to deduce

$$\frac{d}{dt} u(t) \le \left| \frac{d}{dt} \bigl( x(t) - \tilde{x}(t) \bigr) \right| = \bigl| F(x(t)) - F(\tilde{x}(t)) \bigr| \le L \, |x(t) - \tilde{x}(t)| = L \, u(t).$$

Hence, the Gronwall lemma applies with $a = 0$ and

$$u(0) = |x(0) - \tilde{x}(0)| = |x_0 - x_0| = 0,$$

and we obtain the estimate

$$0 \le u(t) = |x(t) - \tilde{x}(t)| \le u(0) \, e^{Lt} = 0,$$

i.e. the two trajectories coincide and the solution is unique. □


2013.09.19 


2.3 Stability analysis 

What if an ODE is not analytically solvable? Beyond using numerics, what information
can we extract? A simple way to learn something about the qualitative behavior is to
linearize the vector field near fixed points, i.e. those $x_0 \in \mathbb{R}^n$ for which

$$F(x_0) = 0.$$

Writing a solution near the fixed point as $x(t) = x_0 + \delta \, y(t)$ with $\delta$ small and Taylor
expanding the vector field, we obtain another linear ODE involving the differential
$DF(x_0) = \bigl( \partial_{x_j} F_k(x_0) \bigr)_{1 \le j,k \le n}$,

$$\frac{d}{dt} x(t) = \delta \, \dot{y}(t) = F\bigl(x_0 + \delta \, y(t)\bigr) = F(x_0) + \delta \, DF(x_0) \, y(t) + O(\delta^2) = \delta \, DF(x_0) \, y(t) + O(\delta^2),$$

so that to leading order in $\delta$

$$\frac{d}{dt} y(t) = DF(x_0) \, y(t).$$

The latter can be solved explicitly, namely

$$y(t) = e^{t \, DF(x_0)} \, y(0).$$

Now the stability of the solutions near fixed points is determined by the eigenvalues $\{\lambda_j\}_{j=1}^N$
of $DF(x_0)$.

Definition 2.3.1 (Stability of fixed points) We call an ODE near a fixed point $x_0$

(i) stable if $\mathrm{Re} \, \lambda_j < 0$ holds for all eigenvalues $\lambda_j$ of $DF(x_0)$,

(ii) marginally stable (or Liapunov stable) if $\mathrm{Re} \, \lambda_j \le 0$ holds for all $j$ and $\mathrm{Re} \, \lambda_j = 0$ for at
least one $j$, and

(iii) unstable otherwise.

There is also another characterization of fixed points:

Definition 2.3.2 An ODE near a fixed point $x_0$ is called

(i) elliptic if $\mathrm{Re} \, \lambda_j = 0$ for all $j = 1, \ldots, N$, and

(ii) hyperbolic if $\mathrm{Im} \, \lambda_j = 0$ for all $j = 1, \ldots, N$ and $\mathrm{Re} \, \lambda_j > 0$ for some $j$.

Example (Lorenz equation) The Lorenz equation is a simplified model of Rayleigh-Bénard
convection, which describes the behavior of a liquid in between two warm plates of different
temperatures:

$$\begin{pmatrix} \dot{x}_1 \\ \dot{x}_2 \\ \dot{x}_3 \end{pmatrix} = \begin{pmatrix} \sigma (x_2 - x_1) \\ -x_1 x_3 + r x_1 - x_2 \\ x_1 x_2 - b x_3 \end{pmatrix} =: F(x)$$

Here, the variable $x_1$ measures the intensity of the convective motion, $x_2$ and $x_3$ describe
temperature differences, the constant $\sigma$ is the so-called Prandtl number, $b$ is a geometric
factor and $r > 0$ is a control parameter. Typical values for the parameters are $\sigma \approx 10$ and
$b \approx 8/3$.

This vector field may have several fixed points (depending on the choice of the
parameters), but $x = 0$ is always a fixed point independently of the values of $\sigma$, $b$ and $r$.

Let us analyze the stability of the ODE near the fixed point at $x = 0$:

$$DF(0) = \begin{pmatrix} -\sigma & +\sigma & 0 \\ r & -1 & 0 \\ 0 & 0 & -b \end{pmatrix}$$

The block structure of the matrix yields that one of the eigenvalues is $-b$.
The other two eigenvalues can be inferred from finding the zeros of

$$\chi(\lambda) = \det \begin{pmatrix} \lambda + \sigma & -\sigma \\ -r & \lambda + 1 \end{pmatrix} = (\lambda + \sigma)(\lambda + 1) - (-1)^2 \, r \sigma = \lambda^2 + (\sigma + 1) \lambda + (1 - r) \sigma,$$

namely

$$\lambda_\pm = -\frac{\sigma + 1}{2} \pm \frac{1}{2} \sqrt{(\sigma + 1)^2 - 4 (1 - r) \sigma}.$$

If $r > 1$, then $(\sigma + 1)^2 - 4 (1 - r) \sigma > (\sigma + 1)^2$, and thus $\lambda_- < 0 < \lambda_+$ as long as $\sigma > 0$.
Hence, the fixed point at $x = 0$ is unstable and hyperbolic.

For the other case, $0 < r < 1$, both eigenvalues are negative, $\lambda_\pm < 0$, and thus the fixed
point is stable (but neither elliptic nor hyperbolic).
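The case distinction in $r$ can be reproduced numerically. The sketch below assumes the standard Lorenz vector field $F(x) = \bigl( \sigma (x_2 - x_1), \, -x_1 x_3 + r x_1 - x_2, \, x_1 x_2 - b x_3 \bigr)$, whose linearization at the origin has the eigenvalue $-b$ together with the two roots of $\lambda^2 + (\sigma + 1) \lambda + (1 - r) \sigma$, and classifies the fixed point according to Definition 2.3.1. The function names and the tolerance are illustrative choices.

```python
import cmath


def lorenz_origin_eigenvalues(sigma, b, r):
    """Eigenvalues of DF(0) = [[-sigma, sigma, 0], [r, -1, 0], [0, 0, -b]]."""
    disc = cmath.sqrt((sigma + 1.0) ** 2 - 4.0 * (1.0 - r) * sigma)
    return (-b, (-(sigma + 1.0) + disc) / 2.0, (-(sigma + 1.0) - disc) / 2.0)


def classify(eigenvalues, tol=1e-12):
    """Stability in the sense of Definition 2.3.1, with a small numerical tolerance."""
    real_parts = [complex(ev).real for ev in eigenvalues]
    if all(re < -tol for re in real_parts):
        return "stable"
    if all(re <= tol for re in real_parts):
        return "marginally stable"
    return "unstable"


sigma, b = 10.0, 8.0 / 3.0
for r in (0.5, 1.0, 28.0):
    eigenvalues = lorenz_origin_eigenvalues(sigma, b, r)
    print(r, eigenvalues, classify(eigenvalues))
```

At the threshold $r = 1$ one eigenvalue vanishes, so the linearization alone only gives marginal stability; this is the parameter value at which the origin changes from stable to unstable.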


2013.09.24 
Chapter 3

Classical mechanics 


This section serves to give a short introduction to the hamiltonian point of view of classical
mechanics and some of its beautiful mathematical structures. For simplicity, we only treat
classical mechanics of a spinless point particle moving in $\mathbb{R}^n$; for the more general theory,
we refer to [MR99; Arn97]. A more thorough introduction to this topic would easily take
up a whole lecture, so we content ourselves with introducing the key precepts and applying
our knowledge from Chapter 2.

We will only treat hamiltonian mechanics here: the dynamics is generated by the so-called
hamilton function $H : \mathbb{R}^{2n} \to \mathbb{R}$ which describes the energy of the system for a
given configuration. Here, $\mathbb{R}^{2n}$ is also known as phase space. Since only energy differences
are measurable, the hamiltonian $H' := H + E_0$, $E_0 \in \mathbb{R}$, generates the same dynamics as $H$.
This is obvious from Hamilton's equations of motion,

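$$\dot{q}(t) = +\nabla_p H\bigl(q(t), p(t)\bigr), \tag{3.0.1a}$$

$$\dot{p}(t) = -\nabla_q H\bigl(q(t), p(t)\bigr), \tag{3.0.1b}$$

which can be rewritten in matrix notation as

$$J \begin{pmatrix} \dot{q}(t) \\ \dot{p}(t) \end{pmatrix} := \begin{pmatrix} 0 & -\mathrm{id}_{\mathbb{R}^n} \\ +\mathrm{id}_{\mathbb{R}^n} & 0 \end{pmatrix} \begin{pmatrix} \dot{q}(t) \\ \dot{p}(t) \end{pmatrix} = \begin{pmatrix} \nabla_q H \\ \nabla_p H \end{pmatrix} \bigl(q(t), p(t)\bigr) =: X_H\bigl(q(t), p(t)\bigr). \tag{3.0.2}$$

The matrix $J$ appearing on the left-hand side is often called symplectic form and leads to
a geometric point of view of classical mechanics. For fixed initial condition $(q_0, p_0) \in \mathbb{R}^{2n}$
at time $t_0 = 0$, i.e. initial position and momentum, the Hamilton's flow

$$\Phi : \mathbb{R}_t \times \mathbb{R}^{2n} \to \mathbb{R}^{2n} \tag{3.0.3}$$

maps $(q_0, p_0)$ onto the trajectory which solves Hamilton's equations of motion,

$$\Phi_t(q_0, p_0) = \bigl(q(t), p(t)\bigr), \qquad \bigl(q(0), p(0)\bigr) = (q_0, p_0).$$

If the flow exists for all $t \in \mathbb{R}$, it has the following nice properties: for all $t, t' \in \mathbb{R}$ and
$(q_0, p_0) \in \mathbb{R}^{2n}$, we have

(i) $\Phi_t\bigl(\Phi_{t'}(q_0, p_0)\bigr) = \Phi_{t+t'}(q_0, p_0)$,

(ii) $\Phi_0(q_0, p_0) = (q_0, p_0)$, and

(iii) $\Phi_t\bigl(\Phi_{-t}(q_0, p_0)\bigr) = \Phi_{t-t}(q_0, p_0) = (q_0, p_0)$.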

Mathematically, this means 4> is a group action of R (with respect to time translations) on 
phase space R^". This is a fancy way of saying: 

(i) If we first evolve for time t and then for time t', this is the same as evolving for time 
t + t'. 

(ii) If we do not evolve at all in time, nothing changes. 

(iii) The system can be evolved forwards or backwards in time. 

The above results immediately apply to the Hamilton’s equations of motion: 

Corollary 3.0.3 Let $H(q,p) = \frac{1}{2m} p^2 + V(q)$ be the hamiltonian which generates the dynamics
according to equation (3.0.2) such that $\nabla_q V$ satisfies a global Lipschitz condition

$$\bigl| \nabla_q V(q) - \nabla_q V(q') \bigr| \le L \, |q - q'| \qquad \forall q, q' \in \mathbb{R}^n.$$

Then the hamiltonian flow $\Phi$ exists for all $t \in \mathbb{R}$.

Obviously, if $\nabla_q V$ is only locally Lipschitz, we have local existence of the flow.

Remark 3.0.4 Note that if all second-order derivatives of $V$ are bounded, then the hamiltonian
vector field $X_H$ is Lipschitz.
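A hamiltonian with globally Lipschitz force is easy to explore numerically. The following sketch assumes the one-dimensional harmonic oscillator, $V(q) = \frac{1}{2} q^2$ with $m = 1$, for which $\nabla_q V(q) = q$ is Lipschitz with $L = 1$ and the exact flow is a rotation in phase space. It approximates the flow with the Störmer-Verlet (leapfrog) scheme, a standard integrator chosen here for illustration and not discussed in the notes, and checks the solution, the conservation of energy and the group property of the discrete flow.

```python
import math


def leapfrog(grad_V, q, p, m, t_end, steps):
    """Stoermer-Verlet (leapfrog) approximation of the flow of q' = p/m, p' = -grad_V(q)."""
    h = t_end / steps
    for _ in range(steps):
        p -= 0.5 * h * grad_V(q)   # half step in momentum
        q += h * p / m             # full step in position
        p -= 0.5 * h * grad_V(q)   # half step in momentum
    return q, p


def energy(q, p, m, V):
    """Hamilton function H(q, p) = p^2 / (2m) + V(q)."""
    return p * p / (2.0 * m) + V(q)


def V(q):
    """Harmonic potential; its gradient is globally Lipschitz with L = 1."""
    return 0.5 * q * q


def grad_V(q):
    return q


m, q0, p0 = 1.0, 1.0, 0.0
q2, p2 = leapfrog(grad_V, q0, p0, m, t_end=2.0, steps=20000)

# group property: evolving for time 1 twice equals evolving for time 2
q1, p1 = leapfrog(grad_V, q0, p0, m, t_end=1.0, steps=10000)
q11, p11 = leapfrog(grad_V, q1, p1, m, t_end=1.0, steps=10000)

print(q2 - math.cos(2.0), p2 + math.sin(2.0))
print(energy(q2, p2, m, V) - energy(q0, p0, m, V))
print(q11 - q2, p11 - p2)
```

The energy is conserved only approximately by the discrete scheme, with an error that shrinks quadratically in the step size; the exact flow conserves $H$ exactly.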

3.1 The trinity of physical theories

It turns out to be useful to take a step back and analyze the generic structure of most 
physical theories. They usually consist of three ingredients: 

(i) A notion of state which encodes the configuration of the system, 

(ii) a notion of observable which predicts the outcome of measurements and 

(iii) a dynamical equation which governs how the physical system evolves. 


3.1.1 States 

The classical particle always moves in phase space $\mathbb{R}^{2n} \cong \mathbb{R}^n_q \times \mathbb{R}^n_p$ of positions and
momenta. Pure states in classical mechanics are simply points in phase space: a point
particle's state at time $t$ is characterized by its position $q(t)$ and its momentum $p(t)$. More
generally, one can consider distributions of initial conditions which are relevant in
statistical mechanics, for instance.

Definition 3.1.1 (Classical states) A classical state is a probability measure $\mu$ on phase
space, that is a positive Borel measure^4 which is normed to 1,

$$\mu(U) \ge 0 \quad \text{for all Borel sets } U \subseteq \mathbb{R}^{2n},$$

$$\mu(\mathbb{R}^{2n}) = \int_{\mathbb{R}^{2n}} d\mu = 1.$$

Pure states are point measures, i.e. if $(q_0, p_0) \in \mathbb{R}^{2n}$, then the associated pure state is given
by $\mu_{(q_0,p_0)}(\cdot) := \delta_{(q_0,p_0)}(\cdot) = \delta\bigl(\cdot - (q_0, p_0)\bigr)$.^5


3.1.2 Observables 

Observables $f$ such as position, momentum, angular momentum and energy describe the outcome of measurements.

Definition 3.1.2 (Classical observables) Classical observables $f \in C^\infty(\mathbb{R}^{2n},\mathbb{R})$ are smooth functions on $\mathbb{R}^{2n}$ with values in $\mathbb{R}$.

Of course, there are cases when observables are functions which are not smooth on all of $\mathbb{R}^{2n}$, e. g. the Hamiltonian which describes Newtonian gravity in three dimensions,

$$ H(q,p) = \frac{1}{2m} p^2 - \frac{g}{|q|} , $$

has a singularity at $q = 0$, but $H \in C^\infty(\mathbb{R}^{2n} \setminus \{0\}, \mathbb{R})$.

Intimately linked is the concept of

Definition 3.1.3 (Spectrum of an observable) The spectrum of a classical observable $f$, i. e. the set of possible outcomes of measurements, is given by

$$ \mathrm{spec}\, f := f(\mathbb{R}^{2n}) = \mathrm{im}\, f . $$

^1 Unfortunately we do not have time to define Borel sets and Borel measures in this context. We refer the interested reader to chapter 1 of [LL01]. Essentially, a Borel measure assigns a “volume” to “nice” sets, i. e. Borel sets.

^2 Here, $\delta$ is the Dirac distribution which we will consider in detail in Chapter 7.


If we are given a classical state $\mu$, then the expectation value $\mathbb{E}_\mu$ of an observable $f$ for the distribution of initial conditions $\mu$ is given by

$$ \mathbb{E}_\mu(f) := \int_{\mathbb{R}^{2n}} \mathrm{d}\mu(q,p) \, f(q,p) . $$


3.1.3 Dynamics: Schrödinger vs. Heisenberg picture

There are two equivalent ways to prescribe dynamics: either we evolve states in time and keep observables fixed or we keep states fixed in time and evolve observables. In quantum mechanics, these two points of view are known as the Schrödinger and Heisenberg picture (after the famous physicists of the same name). Usually, there are situations when one point of view is more convenient than the other.

In both cases the crucial ingredient in the dynamical equation is the energy observable $H(q,p)$, also known as Hamilton function, or Hamiltonian for short. The prototypical form for a non-relativistic particle of mass $m > 0$ subjected to a potential $V$ is

$$ H(q,p) = \frac{1}{2m} p^2 + V(q) . $$

The first term, $\frac{1}{2m} p^2$, is also known as kinetic energy while $V$ is the potential.

We have juxtaposed the Schrödinger and Heisenberg point of view in Table 3.1.1: in the Schrödinger picture, the states

$$ \mu(t) := \mu \circ \Phi_{-t} \qquad (3.1.1) $$

are evolved backwards in time while observables $f$ remain constant. (That may seem unintuitive at first, but we ask the skeptic to continue reading until the end of Section 3.4.) Conversely, in the Heisenberg picture, observables

$$ f(t) := f \circ \Phi_t \qquad (3.1.2) $$

evolve forwards in time whereas states $\mu$ remain fixed. In both cases, the dynamical equations can be written in terms of the so-called Poisson bracket

$$ \{f,g\} := \sum_{j=1}^n \bigl( \partial_{p_j} f \, \partial_{q_j} g - \partial_{q_j} f \, \partial_{p_j} g \bigr) . $$

These equations turn out to be equivalent to proposing Hamilton's equations of motion (3.0.2) (cf. Proposition 3.3.1).

Example For the special observables position $q_j$ and momentum $p_j$, equation (3.3.1) reduces to the components of Hamilton's equations of motion (3.0.2),

$$ \dot q_j = \{H, q_j\} = +\partial_{p_j} H , $$

$$ \dot p_j = \{H, p_j\} = -\partial_{q_j} H . $$
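The reduction to Hamilton's equations can also be checked numerically. The following is a minimal sketch, assuming a one-dimensional particle with the illustrative potential $V(q) = q^2/2$ and mass $m = 2$; the helper names `partial` and `poisson_bracket` are hypothetical and the derivatives are approximated by central finite differences.

```python
def partial(f, i, x, h=1e-5):
    """Central finite-difference approximation of the i-th partial derivative of f at x."""
    xp = list(x)
    xm = list(x)
    xp[i] += h
    xm[i] -= h
    return (f(xp) - f(xm)) / (2.0 * h)


def poisson_bracket(f, g, q, p):
    """{f, g} = sum_j (d_{p_j} f * d_{q_j} g - d_{q_j} f * d_{p_j} g) at the point (q, p)."""
    n = len(q)
    x = list(q) + list(p)  # phase space point, positions first, then momenta
    total = 0.0
    for j in range(n):
        total += partial(f, n + j, x) * partial(g, j, x)
        total -= partial(f, j, x) * partial(g, n + j, x)
    return total


M = 2.0  # mass (illustrative value)


def hamiltonian(x):
    q, p = x
    return p * p / (2.0 * M) + 0.5 * q * q  # V(q) = q^2 / 2 is an assumption


def position(x):
    return x[0]


def momentum(x):
    return x[1]


q0, p0 = [0.7], [-1.3]
qdot = poisson_bracket(hamiltonian, position, q0, p0)  # should approximate +p/m
pdot = poisson_bracket(hamiltonian, momentum, q0, p0)  # should approximate -V'(q) = -q
```

With this sign convention, `qdot` agrees with $+\partial_p H$ and `pdot` with $-\partial_q H$ up to the finite-difference error.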
                      Schrödinger picture                          Heisenberg picture

States                $\mu(t) = \mu \circ \Phi_{-t}$               $\mu$

Observables           $f$                                          $f(t) = f \circ \Phi_t$

Dynamical equation    $\frac{d}{dt} \mu(t) = -\{H, \mu(t)\}$       $\frac{d}{dt} f(t) = \{H, f(t)\}$


Table 3.1.1: Comparison between Schrödinger and Heisenberg picture.


3.2 Conservation of probability and the Liouville theorem 


In the Schrödinger picture, states are time-dependent while observables remain fixed in time. If $\mu$ is a state, we develop it backwards in time, $\mu(t) = \mu \circ \Phi_{-t}$, using the hamiltonian flow $\Phi$ associated to (3.0.2).

A priori, it is not obvious that $\mu(t)$ is still a classical state in the sense of Definition 3.1.1. The first requirement, $(\mu(t))(U) \ge 0$, is still satisfied since $\Phi_{-t}(U)$ is again a subset of $\mathbb{R}^{2n}$.^3 What is not obvious is whether $\mu(t)$ is still normed, i. e. whether

$$ (\mu(t))(\mathbb{R}^{2n}) = \int_{\mathbb{R}^{2n}} \mathrm{d}\mu(t) = 1 ? $$


Proposition 3.2.1 Let $\mu$ be a state on phase space and $\Phi_t$ the flow generated by a hamiltonian $H \in C^\infty(\mathbb{R}^{2n})$ which we assume to exist for $|t| \le T$ where $0 < T \le \infty$ is suitable. Then $\mu(t)$ is again a state.

The proof of this relies on a very deep result of classical mechanics, the so-called 

Theorem 3.2.2 (Liouville) The hamiltonian vector field is divergence free, i. e. the hamiltonian flow preserves volume in phase space of bounded subsets $V$ of $\mathbb{R}^{2n}$ with smooth boundary $\partial V$. In particular, the functional determinant of the flow is constant and equal to

$$ \det\bigl(D\Phi_t(q,p)\bigr) = 1 $$

for all $t \in \mathbb{R}$ and $(q,p) \in \mathbb{R}^{2n}$.

Remark 3.2.3 We will need a fact from the theory of dynamical systems: if $\Phi_t$ is the flow associated to a differential equation $\dot x = F(x)$ with $F \in C^1(\mathbb{R}^n,\mathbb{R}^n)$, then

$$ \frac{d}{dt} D\Phi_t(x) = DF\bigl(\Phi_t(x)\bigr) \, D\Phi_t(x) , \qquad D\Phi_t(x)\big|_{t=0} = \mathrm{id}_{\mathbb{R}^n} , $$

holds for the differential of the flow. As a consequence, one can prove

$$ \frac{d}{dt} \det\bigl(D\Phi_t(x)\bigr) = \mathrm{tr}\bigl(DF(\Phi_t(x))\bigr) \, \det\bigl(D\Phi_t(x)\bigr) = \mathrm{div}\, F\bigl(\Phi_t(x)\bigr) \, \det\bigl(D\Phi_t(x)\bigr) , $$

and $\det\bigl(D\Phi_t(x)\bigr)\big|_{t=0} = 1$. There are more elegant, general and geometric proofs of this fact, but they are beyond the scope of our short introduction.

Figure 3.2.1: Phase space volume is preserved under the hamiltonian flow.

^3 The fact that $\Phi_{-t}(U)$ is again Borel measurable follows from the continuity of $\Phi_t$ in $t$.

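Proof (Theorem 3.2.2) Let $H$ be the hamiltonian which generates the flow $\Phi_t$. Let us denote the hamiltonian vector field by

$$ X_H = \begin{pmatrix} +\nabla_p H \\ -\nabla_q H \end{pmatrix} . $$

Then a direct calculation yields

$$ \mathrm{div}\, X_H = \sum_{j=1}^n \Bigl( \partial_{q_j} \bigl( +\partial_{p_j} H \bigr) + \partial_{p_j} \bigl( -\partial_{q_j} H \bigr) \Bigr) = 0 $$

and the hamiltonian vector field is divergence free. This implies the hamiltonian flow preserves volumes in phase space: let $V \subseteq \mathbb{R}^{2n}$ be a bounded region in phase space (a Borel subset) with smooth boundary. Then for all $-T \le t \le T$ for which the flow exists, we have

$$ \frac{d}{dt} \mathrm{Vol}\bigl(\Phi_t(V)\bigr) = \frac{d}{dt} \int_{\Phi_t(V)} \mathrm{d}q \, \mathrm{d}p = \frac{d}{dt} \int_V \mathrm{d}q' \, \mathrm{d}p' \, \det\bigl(D\Phi_t(q',p')\bigr) . $$

Since $V$ is bounded, we can bound $\det(D\Phi_t)$ and its time derivative uniformly. Thus, we can interchange integration and differentiation and apply Remark 3.2.3,

$$ \frac{d}{dt} \int_V \mathrm{d}q' \, \mathrm{d}p' \, \det\bigl(D\Phi_t(q',p')\bigr) = \int_V \mathrm{d}q' \, \mathrm{d}p' \, \frac{d}{dt} \det\bigl(D\Phi_t(q',p')\bigr) = \int_V \mathrm{d}q' \, \mathrm{d}p' \, \underbrace{\mathrm{div}\, X_H\bigl(\Phi_t(q',p')\bigr)}_{=0} \, \det\bigl(D\Phi_t(q',p')\bigr) = 0 . $$

Hence $\frac{d}{dt} \mathrm{Vol}\bigl(\Phi_t(V)\bigr) = 0$ and the hamiltonian flow conserves phase space volume. The functional determinant of the flow is constant as the time derivative vanishes,

$$ \frac{d}{dt} \det\bigl(D\Phi_t(q',p')\bigr) = 0 , $$

and equal to 1,

$$ \det\bigl(D\Phi_t(q',p')\bigr)\big|_{t=0} = \det \mathrm{id}_{\mathbb{R}^{2n}} = 1 . $$

This concludes the proof. □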


With a different proof relying on alternating multilinear forms, the requirements on V can 
be lifted, see e. g. [Arn97, Theorem on pp. 204-207]. 
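Liouville's theorem can be probed numerically. The sketch below is an illustration under stated assumptions, not part of the proof: it takes the pendulum Hamiltonian $H(q,p) = \frac{1}{2} p^2 + (1 - \cos q)$ as an example, approximates the flow with the classical Runge-Kutta scheme and estimates the functional determinant by central finite differences. All function names are hypothetical.

```python
import math


def vector_field(q, p):
    """Hamiltonian vector field (d_p H, -d_q H) for H = p^2/2 + (1 - cos q)."""
    return p, -math.sin(q)


def flow(q, p, t, steps=2000):
    """Approximate Phi_t(q, p) with the classical fourth-order Runge-Kutta scheme."""
    h = t / steps
    for _ in range(steps):
        k1q, k1p = vector_field(q, p)
        k2q, k2p = vector_field(q + 0.5 * h * k1q, p + 0.5 * h * k1p)
        k3q, k3p = vector_field(q + 0.5 * h * k2q, p + 0.5 * h * k2p)
        k4q, k4p = vector_field(q + h * k3q, p + h * k3p)
        q += h * (k1q + 2 * k2q + 2 * k3q + k4q) / 6.0
        p += h * (k1p + 2 * k2p + 2 * k3p + k4p) / 6.0
    return q, p


def jacobian_determinant(q, p, t, eps=1e-5):
    """Finite-difference estimate of det(D Phi_t(q, p))."""
    qa, pa = flow(q + eps, p, t)
    qb, pb = flow(q - eps, p, t)
    qc, pc = flow(q, p + eps, t)
    qd, pd = flow(q, p - eps, t)
    dq_dq = (qa - qb) / (2 * eps)
    dp_dq = (pa - pb) / (2 * eps)
    dq_dp = (qc - qd) / (2 * eps)
    dp_dp = (pc - pd) / (2 * eps)
    return dq_dq * dp_dp - dq_dp * dp_dq


det = jacobian_determinant(1.0, 0.5, t=3.0)
```

Up to discretization and finite-difference errors, `det` should be close to 1 even though the pendulum flow is nonlinear.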

Proof (Proposition 3.2.1) Since $\Phi_t$ is continuous, it is also measurable. Thus $\mu(t) = \mu \circ \Phi_{-t}$ is also a Borel measure on $\mathbb{R}^{2n}$ ($\Phi_{-t}$ exists by assumption on $t$). In fact, $\Phi_t$ is a diffeomorphism on phase space. Liouville's theorem 3.2.2 not only ensures that the measure $\mu(t)$ remains positive, but also that it is normalized to 1: let $U \subseteq \mathbb{R}^{2n}$ be a Borel set. Then we conclude

$$ \bigl(\mu(t)\bigr)(U) = \int_U \mathrm{d}\bigl(\mu(t)\bigr)(q,p) = \int_U \mathrm{d}\mu\bigl(\Phi_{-t}(q,p)\bigr) = \int_{\Phi_{-t}(U)} \mathrm{d}\mu(q,p) \, \det\bigl(D\Phi_t(q,p)\bigr) = \int_{\Phi_{-t}(U)} \mathrm{d}\mu(q,p) \ge 0 $$

where we have used the positivity of $\mu$ and the fact that $\Phi_{-t}(U)$ is again a Borel set by continuity of $\Phi_{-t}$. If we set $U = \mathbb{R}^{2n}$ and use the fact that the flow is a diffeomorphism, we see that $\mathbb{R}^{2n}$ is mapped onto itself, $\Phi_{-t}(\mathbb{R}^{2n}) = \mathbb{R}^{2n}$, and the normalization of $\mu$ leads to

$$ \bigl(\mu(t)\bigr)(\mathbb{R}^{2n}) = \int_{\Phi_{-t}(\mathbb{R}^{2n})} \mathrm{d}\mu(q,p) = \int_{\mathbb{R}^{2n}} \mathrm{d}\mu(q,p) = 1 . $$

This concludes the proof. □


3.3 Equation of motion for observables and Poisson algebras

Viewed in the Heisenberg picture, observables move in time, $f(t) = f \circ \Phi_t$, while states remain fixed. Seeing as $\Phi_t$ is invertible, it maps $\mathbb{R}^{2n}$ onto itself, and thus the spectrum of the observable does not change in time,

$$ \mathrm{spec}\, f(t) = \mathrm{spec}\, f . $$

For many applications and arguments, it will be helpful to find a dynamical equation for $f(t)$ directly:

Proposition 3.3.1 Let $f \in C^\infty(\mathbb{R}^{2n},\mathbb{R})$ be an observable and $\Phi$ the hamiltonian flow which solves the equations of motion (3.0.2) associated to a hamiltonian $H \in C^\infty(\mathbb{R}^{2n},\mathbb{R})$ which we assume to exist globally in time for all $(q_0,p_0) \in \mathbb{R}^{2n}$. Then

$$ \frac{d}{dt} f(t) = \{H, f(t)\} \qquad (3.3.1) $$

holds where $\{f,g\} := \sum_{j=1}^n \bigl( \partial_{p_j} f \, \partial_{q_j} g - \partial_{q_j} f \, \partial_{p_j} g \bigr)$ is the so-called Poisson bracket.

Proof Theorem 2.2.5 implies the smoothness of the flow from the smoothness of the hamiltonian. This means $f(t) \in C^\infty(\mathbb{R}^{2n},\mathbb{R})$ is again a classical observable. By assumption, all initial conditions lead to trajectories that exist globally in time.^4 For $(q_0,p_0)$, we compute the time derivative of $f(t)$ to be

$$ \begin{aligned} \Bigl( \frac{d}{dt} f(t) \Bigr)(q_0,p_0) &= \frac{d}{dt} f\bigl(q(t),p(t)\bigr) \\ &= \sum_{j=1}^n \Bigl( \partial_{q_j} f \circ \Phi_t(q_0,p_0) \, \dot q_j(t) + \partial_{p_j} f \circ \Phi_t(q_0,p_0) \, \dot p_j(t) \Bigr) \\ &\overset{*}{=} \sum_{j=1}^n \Bigl( \partial_{q_j} f \circ \Phi_t(q_0,p_0) \; \partial_{p_j} H \circ \Phi_t(q_0,p_0) + \partial_{p_j} f \circ \Phi_t(q_0,p_0) \, \bigl( -\partial_{q_j} H \circ \Phi_t(q_0,p_0) \bigr) \Bigr) \\ &= \{H(t), f(t)\}(q_0,p_0) . \end{aligned} $$

In the step marked with $*$, we have inserted Hamilton's equations of motion. Compared to equation (3.3.1), we have $H(t)$ instead of $H$ as argument in the Poisson bracket. However, by setting $f(t) = H(t)$ in the above equation, we see that energy is a conserved quantity,

$$ \frac{d}{dt} H(t) = \{H(t), H(t)\} = 0 . $$

Hence, we can replace $H(t)$ by $H$ in the Poisson bracket with $f$ and obtain equation (3.3.1). □

^4 A slightly more sophisticated argument shows that the Proposition holds if the hamiltonian flow exists only locally in time.

Remark 3.3.2 One can prove that $f(t) = f \circ \Phi_t$ is the only solution to equation (3.3.1), but that requires a little more knowledge about symplectic geometry (cf. Proposition 5.4.2 and Proposition 5.5.2 of [MR99]).

Conserved quantities The proof immediately leads to the notion of conserved quantity: 

Definition 3.3.3 (Conserved quantity/constant of motion) An observable $f \in C^\infty(\mathbb{R}^{2n},\mathbb{R})$ which is invariant under the flow $\Phi$ generated by the hamiltonian $H \in C^\infty(\mathbb{R}^{2n},\mathbb{R})$, i. e.

$$ f(t) = f(0) , $$

or equivalently satisfies

$$ \frac{d}{dt} f(t) = \{H, f(t)\} = 0 , $$

is called conserved quantity or constant of motion.

As is often the case in physics and mathematics, we have completed the circle: starting from Hamilton's equations of motion, we have proven that the time evolution of observables is given by the Poisson bracket. Alternatively, we could have started by postulating

$$ \frac{d}{dt} f(t) = \{H, f(t)\} $$

for observables and we would have arrived at Hamilton's equations of motion by plugging in $q$ and $p$ as observables.

Seemingly, to check whether an observable is a constant of motion requires one to solve 
the equation of motion, but this is not so. A very important property of the Poisson bracket 
is the following: 

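Proposition 3.3.4 (Properties of the Poisson bracket) If $\Phi_t$ is the flow associated to $H \in C^\infty(\mathbb{R}^{2n})$, then for any $f,g,h \in C^\infty(\mathbb{R}^{2n})$, the following statements hold true:

(i) $\{\cdot,\cdot\} : C^\infty(\mathbb{R}^{2n}) \times C^\infty(\mathbb{R}^{2n}) \to C^\infty(\mathbb{R}^{2n})$

(ii) $\{f,g\} = -\{g,f\}$ (antisymmetry)

(iii) $\{f,g\} \circ \Phi_t = \{f \circ \Phi_t, g \circ \Phi_t\}$

(iv) $\{f,\{g,h\}\} + \{h,\{f,g\}\} + \{g,\{h,f\}\} = 0$ (Jacobi identity)

(v) $\{fg,h\} = f\{g,h\} + g\{f,h\}$ (derivation property)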


2013.09.26 


Proof (i) $\{f,g\}$ consists of products of derivatives of $f$ and $g$, and thus their Poisson bracket is in the class $C^\infty(\mathbb{R}^{2n})$.

(ii) The antisymmetry is obvious.

(iii) The fact that the time evolution and the Poisson bracket commute is a very deep result of Hamiltonian mechanics (see e. g. [MR99, Proposition 5.4.2]), but that goes far beyond our current capabilities.

(iv) This follows from either a straightforward (and boring) calculation or one can use (iii): we compute the derivative of this equation and obtain

$$ \begin{aligned} 0 &= \frac{d}{dt} \Bigl( \{f,g\} \circ \Phi_t - \{f \circ \Phi_t, g \circ \Phi_t\} \Bigr) \\ &= \{H, \{f,g\} \circ \Phi_t\} - \{\{H, f \circ \Phi_t\}, g \circ \Phi_t\} - \{f \circ \Phi_t, \{H, g \circ \Phi_t\}\} \\ &= \{H, \{f,g\} \circ \Phi_t\} + \{g \circ \Phi_t, \{H, f \circ \Phi_t\}\} + \{f \circ \Phi_t, \{g \circ \Phi_t, H\}\} . \end{aligned} $$

Setting $t = 0$ yields the Jacobi identity.

(v) The derivation property follows directly from the product rule for partial derivatives and the definition of $\{\cdot,\cdot\}$. □

Even though we cannot prove (iii) with our current means, under the assumption that the solution to equation (3.3.1) with initial condition $f(0) = 0$ is unique and given by $f(t) = 0$, we can deduce (iii) using the Jacobi identity (iv): $\{f,g\} \circ \Phi_t = \{f \circ \Phi_t, g \circ \Phi_t\}$ holds if this equality is true for $t = 0$ (which follows directly from $\Phi_0 = \mathrm{id}_{\mathbb{R}^{2n}}$) and if the time derivative of this equality is satisfied. Hence, we compare

$$ \frac{d}{dt} \bigl( \{f,g\} \circ \Phi_t \bigr) = \{H, \{f,g\} \circ \Phi_t\} , $$

which holds by Proposition 3.3.1, to

$$ \begin{aligned} \frac{d}{dt} \{f \circ \Phi_t, g \circ \Phi_t\} &= \{\{H, f \circ \Phi_t\}, g \circ \Phi_t\} + \{f \circ \Phi_t, \{H, g \circ \Phi_t\}\} \\ &= -\{f \circ \Phi_t, \{g \circ \Phi_t, H\}\} - \{g \circ \Phi_t, \{H, f \circ \Phi_t\}\} \\ &= +\{H, \{f \circ \Phi_t, g \circ \Phi_t\}\} . \end{aligned} $$

Hence, the difference $\Delta(t) := \{f,g\} \circ \Phi_t - \{f \circ \Phi_t, g \circ \Phi_t\}$ satisfies equation (3.3.1),

$$ \frac{d}{dt} \Delta(t) = \{H, \Delta(t)\} , $$

with initial condition $\Delta(0) = 0$. We have assumed this equation has the unique solution $\Delta(t) = 0$ which yields (iii).

Proposition 3.3.4 (iii) simplifies computing the solution $f(t)$ and finding constants of motion:

Corollary 3.3.5 (i) Equation (3.3.1) is equivalent to

$$ \frac{d}{dt} f(t) = \{H, f\} \circ \Phi_t . $$

(ii) $f$ is a constant of motion if and only if

$$ \{H, f\} = 0 . $$

Proof (i) Since $H = H(t)$ is a constant of motion, the right-hand side of (3.3.1) can be written as

$$ \{H, f(t)\} = \{H(t), f(t)\} = \{H \circ \Phi_t, f \circ \Phi_t\} = \{H, f\} \circ \Phi_t $$

using Proposition 3.3.4 (iii).

(ii) This follows directly from (i) and the definition of constant of motion. □
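The criterion $\{H,f\} = 0$ can be tested without solving the equations of motion. As a hedged illustration, the sketch below assumes a particle in the plane with unit mass and the rotationally symmetric potential $V(q) = |q|^4/4$ (both choices are assumptions made for the example) and evaluates the bracket of $H$ with the angular momentum $q_1 p_2 - q_2 p_1$ by finite differences; the helper names are hypothetical.

```python
def partial(f, i, x, h=1e-5):
    """Central finite-difference approximation of the i-th partial derivative of f at x."""
    xp = list(x)
    xm = list(x)
    xp[i] += h
    xm[i] -= h
    return (f(xp) - f(xm)) / (2.0 * h)


def poisson_bracket(f, g, x):
    """{f, g} at the phase space point x = (q_1, ..., q_n, p_1, ..., p_n)."""
    n = len(x) // 2
    return sum(
        partial(f, n + j, x) * partial(g, j, x) - partial(f, j, x) * partial(g, n + j, x)
        for j in range(n)
    )


def hamiltonian(x):
    q1, q2, p1, p2 = x
    # kinetic energy with m = 1 plus the central potential |q|^4 / 4 (illustrative)
    return (p1 * p1 + p2 * p2) / 2.0 + (q1 * q1 + q2 * q2) ** 2 / 4.0


def angular_momentum(x):
    q1, q2, p1, p2 = x
    return q1 * p2 - q2 * p1


def first_momentum(x):
    return x[2]


point = [0.8, -0.3, 0.5, 1.1]
bracket_angular = poisson_bracket(hamiltonian, angular_momentum, point)  # close to 0
bracket_momentum = poisson_bracket(hamiltonian, first_momentum, point)   # -d_{q_1} H, not 0
```

The bracket with the angular momentum vanishes up to numerical error, so it is a constant of motion, whereas $p_1$ is not conserved because the potential depends on $q_1$.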

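Moreover, the properties above single out the following algebraic structure:

Definition 3.3.6 (Poisson algebra) Let $\mathcal{P} \subseteq C^\infty(\mathbb{R}^{2n})$ be a subalgebra of the smooth functions (i. e. $\mathcal{P}$ is closed under taking linear combinations and products). Moreover, assume that the Poisson bracket has the derivation property, $\{fg,h\} = f\{g,h\} + g\{f,h\}$, and

$$ \{\cdot,\cdot\} : \mathcal{P} \times \mathcal{P} \to \mathcal{P} $$

maps $\mathcal{P} \times \mathcal{P}$ into $\mathcal{P}$. Then $(\mathcal{P}, \{\cdot,\cdot\})$ is a Poisson algebra.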


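3.4 Equivalence of Schrödinger and Heisenberg picture

We are still missing the proof that the Heisenberg and Schrödinger pictures describe the physics equally. The main observation is that taking expectation values in either leads to the same result, i. e. that for any observable $f$ and state $\mu$,

$$ \begin{aligned} \mathbb{E}_{\mu(t)}(f) &= \int_{\mathbb{R}^{2n}} \mathrm{d}q \, \mathrm{d}p \, \bigl(\mu(t)\bigr)(q,p) \, f(q,p) \\ &= \int_{\mathbb{R}^{2n}} \mathrm{d}q \, \mathrm{d}p \, \mu \circ \Phi_{-t}(q,p) \, f(q,p) \\ &\overset{*}{=} \int_{\mathbb{R}^{2n}} \mathrm{d}q \, \mathrm{d}p \, \mu(q,p) \, f \circ \Phi_t(q,p) \, \det\bigl(D\Phi_t(q,p)\bigr) \\ &= \int_{\mathbb{R}^{2n}} \mathrm{d}q \, \mathrm{d}p \, \mu(q,p) \, \bigl(f(t)\bigr)(q,p) \\ &= \mathbb{E}_\mu\bigl(f(t)\bigr) \end{aligned} $$

holds. Note that the crucial ingredient in the step marked with $*$ is again the Liouville theorem 3.2.2. Moreover, we see why states need to be evolved backwards in time.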
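The equality of the two expectation values can be illustrated on a grid. The sketch below is an assumption-laden example rather than a general method: it takes the harmonic oscillator $H = \frac{1}{2}(p^2 + q^2)$, whose flow is an explicit rotation in phase space, a Gaussian density centered at $(1,0)$ as state, and the observable $f(q,p) = q^2 + p$, and compares both integrals by a simple Riemann sum. All names are hypothetical.

```python
import math

SIGMA2 = 0.1  # variance of the Gaussian state (illustrative)


def density(q, p):
    """Gaussian probability density centered at (1, 0)."""
    return math.exp(-((q - 1.0) ** 2 + p ** 2) / (2 * SIGMA2)) / (2 * math.pi * SIGMA2)


def flow(t, q, p):
    """Exact flow of H = (p^2 + q^2)/2: a rotation in phase space."""
    c, s = math.cos(t), math.sin(t)
    return q * c + p * s, -q * s + p * c


def observable(q, p):
    return q * q + p


def integrate(integrand, half_width=4.0, n=201):
    """Riemann sum over the square [-half_width, half_width]^2."""
    h = 2 * half_width / (n - 1)
    total = 0.0
    for i in range(n):
        q = -half_width + i * h
        for j in range(n):
            p = -half_width + j * h
            total += integrand(q, p)
    return total * h * h


t = 0.9
# Schroedinger picture: evolve the state backwards, keep the observable fixed
schrodinger = integrate(lambda q, p: density(*flow(-t, q, p)) * observable(q, p))
# Heisenberg picture: keep the state fixed, evolve the observable forwards
heisenberg = integrate(lambda q, p: density(q, p) * observable(*flow(t, q, p)))
# closed-form value for this Gaussian state and observable
exact = SIGMA2 + math.cos(t) ** 2 - math.sin(t)
```

Both sums agree with each other and with the closed-form value, which is what the chain of equalities predicts when the functional determinant equals 1.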


3.5 The inclusion of magnetic fields 

Magnetic fields can only be defined in dimension 2 or higher; usually, one considers the case $n = 3$. There are two main ways to include magnetic fields into classical mechanics: minimal substitution, and a more geometric way where the magnetic field enters into the geometric structure of phase space. Both descriptions are equivalent, though.

The two-dimensional case can be obtained by restricting the three-dimensional case to the $q_1 q_2$-plane.

The starting point from a physical point of view is the Lorentz force law: an electric field $E$ and a magnetic field $B(q) = \bigl(B_1(q), B_2(q), B_3(q)\bigr)$ exert a force on a particle with charge $e$ at $q$ moving with velocity $v$ that is given by

$$ F_L = e E + v \times e B . \qquad (3.5.1) $$

For simplicity, from now on, we set the charge $e = 1$.

The goal of this section is to include magnetic fields in the framework of Hamiltonian mechanics. Electric fields $E = -\nabla_q V$ appear as potentials in the Hamilton function $H(q,p) = \frac{1}{2m} p^2 + V(q)$, but magnetic fields are not gradients of a potential. Instead, one can express the magnetic field

$$ B = \nabla_q \times A $$

as the curl of a vector potential $A = (A_1, A_2, A_3)$.


3.5.1 Minimal substitution 

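The standard way to add the interaction to a magnetic field is to consider the equations of motion for position $q$ and kinetic momentum

$$ p^A(q,p) = p - A(q) . $$

Moreover, $p$ is replaced by $p^A = p - A$ in the Hamiltonian,

$$ H^A(q,p) := H\bigl(q, p - A(q)\bigr) . $$

One then proposes the usual equations of motion:

$$ \begin{pmatrix} 0 & -\mathrm{id}_{\mathbb{R}^3} \\ +\mathrm{id}_{\mathbb{R}^3} & 0 \end{pmatrix} \begin{pmatrix} \dot q \\ \dot p \end{pmatrix} = \begin{pmatrix} \nabla_q H^A \\ \nabla_p H^A \end{pmatrix} \qquad (3.5.2) $$

Taking the time-derivative of kinetic momentum yields

$$ \begin{aligned} \frac{d}{dt} p^A_j &= \dot p_j - \sum_{k=1}^3 \partial_{q_k} A_j \, \dot q_k \\ &= -\partial_{q_j} H^A - \sum_{k=1}^3 \partial_{q_k} A_j \, \partial_{p_k} H^A \\ &= -(\partial_{q_j} H)^A + \sum_{k=1}^3 (\partial_{p_k} H)^A \, \partial_{q_j} A_k - \sum_{k=1}^3 \partial_{q_k} A_j \, (\partial_{p_k} H)^A \\ &= -(\partial_{q_j} H)^A + \sum_{k=1}^3 \bigl( \partial_{q_j} A_k - \partial_{q_k} A_j \bigr) \, (\partial_{p_k} H)^A \\ &= -(\nabla_q H)^A_j + \bigl( B \, (\nabla_p H)^A \bigr)_j = -(\nabla_q H)^A_j + (B \, \dot q)_j , \end{aligned} $$

where the superscript $A$ on a derivative of $H$ indicates that it is evaluated at $(q, p - A(q))$. Here we have set $B_{jk} := \partial_{q_j} A_k - \partial_{q_k} A_j$ and collected these entries in the magnetic field matrix

$$ B := (B_{jk})_{1 \le j,k \le 3} = \begin{pmatrix} 0 & +B_3 & -B_2 \\ -B_3 & 0 & +B_1 \\ +B_2 & -B_1 & 0 \end{pmatrix} . $$

If we use

$$ B p = \begin{pmatrix} 0 & +B_3 & -B_2 \\ -B_3 & 0 & +B_1 \\ +B_2 & -B_1 & 0 \end{pmatrix} \begin{pmatrix} p_1 \\ p_2 \\ p_3 \end{pmatrix} = \begin{pmatrix} B_3 p_2 - B_2 p_3 \\ -B_3 p_1 + B_1 p_3 \\ B_2 p_1 - B_1 p_2 \end{pmatrix} = p \times B , \qquad (3.5.3) $$

we can simplify the equation for $p^A$ to

$$ \frac{d}{dt} p^A = -\nabla_q H + \nabla_p H \times B = -\nabla_q H + \dot q \times B . $$

For a non-relativistic particle with Hamiltonian $H(q,p) = \frac{1}{2m} p^2 + V(q)$, these equations reduce to the Lorentz force law (3.5.1):

$$ \frac{d}{dt} p^A = -\nabla_q V + \frac{p^A}{m} \times B . $$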

Changes of gauge A choice of vector potential $A$ is also called a choice of gauge. If $\chi : \mathbb{R}^3 \to \mathbb{R}$ is a scalar function then $A$ and

$$ A' = A + \nabla_q \chi $$

are both vector potentials associated to $B$, because $\nabla_q \times \nabla_q \chi = 0$,

$$ \nabla_q \times A' = \nabla_q \times (A + \nabla_q \chi) = \nabla_q \times A + \nabla_q \times \nabla_q \chi = \nabla_q \times A = B . $$

Thus, we can choose either the gauge $A$ or $A'$; either one describes the physical situation: the equation of motion for $p^A$ only involves $B$ rather than $A$. Hence, gauges are usually chosen for convenience (e. g. a particular “symmetry” or being divergence free, $\nabla_q \cdot A = 0$).
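Gauge invariance of the magnetic field can be checked numerically. In the following sketch the constant field strength, the symmetric gauge and the gauge function $\chi(q) = q_1 q_2 + q_1 \sin q_3$ are all assumptions chosen for illustration; the curl is approximated by central finite differences and the function names are hypothetical.

```python
import math

B0 = 1.5  # strength of a constant field pointing in the q_3 direction (illustrative)


def vector_potential(q):
    """Symmetric gauge for the constant magnetic field (0, 0, B0)."""
    return [-0.5 * B0 * q[1], 0.5 * B0 * q[0], 0.0]


def gauge_gradient(q):
    """Gradient of chi(q) = q_1 q_2 + q_1 sin(q_3), computed by hand."""
    return [q[1] + math.sin(q[2]), q[0], q[0] * math.cos(q[2])]


def transformed_potential(q):
    """A' = A + grad chi."""
    return [a + g for a, g in zip(vector_potential(q), gauge_gradient(q))]


def curl(field, q, h=1e-5):
    """Central finite-difference approximation of the curl of a vector field at q."""

    def d(i, j):
        # derivative of component i with respect to coordinate j (0-based indices)
        qp = list(q)
        qm = list(q)
        qp[j] += h
        qm[j] -= h
        return (field(qp)[i] - field(qm)[i]) / (2 * h)

    return [d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1)]


point = [0.3, -1.2, 0.7]
curl_A = curl(vector_potential, point)
curl_A_prime = curl(transformed_potential, point)
```

Both potentials give the same field $(0, 0, B_0)$ up to the finite-difference error, although $A$ and $A'$ differ at every point.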


3.5.2 Magnetic symplectic form 


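A second way to include the interaction to a magnetic field is to integrate it into the symplectic form:

$$ \begin{pmatrix} B & -\mathrm{id}_{\mathbb{R}^3} \\ +\mathrm{id}_{\mathbb{R}^3} & 0 \end{pmatrix} \begin{pmatrix} \dot q \\ \dot p \end{pmatrix} = \begin{pmatrix} \nabla_q H \\ \nabla_p H \end{pmatrix} \qquad (3.5.4) $$

Note that neither the Hamiltonian nor the momentum is altered. Instead, in this variant, $q$ is position and $p$ is kinetic momentum. Solving the above equation for $\dot q$ and $\dot p$ yields, with (3.5.3),

$$ \begin{pmatrix} B & -\mathrm{id}_{\mathbb{R}^3} \\ +\mathrm{id}_{\mathbb{R}^3} & 0 \end{pmatrix} \begin{pmatrix} \dot q \\ \dot p \end{pmatrix} = \begin{pmatrix} \nabla_q H \\ \nabla_p H \end{pmatrix} \iff \begin{pmatrix} \dot q \\ \dot p \end{pmatrix} = \begin{pmatrix} +\nabla_p H \\ -\nabla_q H + B \dot q \end{pmatrix} = \begin{pmatrix} +\nabla_p H \\ -\nabla_q H + \dot q \times B \end{pmatrix} . $$

In other words, we have again recovered the Lorentz force law (3.5.1).

From a mathematical perspective, the advantage of this formulation is that magnetic fields are always “nicer” functions than associated vector potentials.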


3.6 Stability analysis 


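It turns out that fixed points of hamiltonian systems are never asymptotically stable: we start by linearizing the hamiltonian vector field,

$$ X_H(q,p) = \begin{pmatrix} +\nabla_p H(q,p) \\ -\nabla_q H(q,p) \end{pmatrix} , \qquad DX_H(q,p) = \begin{pmatrix} \nabla_q^T \nabla_p H(q,p) & \nabla_p^T \nabla_p H(q,p) \\ -\nabla_q^T \nabla_q H(q,p) & -\nabla_p^T \nabla_q H(q,p) \end{pmatrix} . $$

Thus, the linearized vector field is always trace-less,

$$ \mathrm{Tr}\, DX_H(q,p) = \sum_{j=1}^n \bigl( \partial_{q_j} \partial_{p_j} H(q,p) - \partial_{p_j} \partial_{q_j} H(q,p) \bigr) = 0 , $$

and using that the sum of eigenvalues of $DX_H(q,p)$ equals the trace of $DX_H(q,p)$, we know that the eigenvalues $\lambda_j$ (repeated according to their multiplicity) sum to 0,

$$ \mathrm{Tr}\, DX_H(q,p) = 0 = \sum_{j=1}^{2n} \lambda_j . $$

Moreover, seeing as the entries of $DX_H(q,p)$ are real, the eigenvalues come in complex conjugate pairs $\{\lambda_j, \overline{\lambda_j}\}$. Combined with the fact that $DX_H(q,p)$ is $2n \times 2n$-dimensional, we deduce that if $\lambda_j$ is an eigenvalue of $DX_H(q,p)$, then so is $-\lambda_j$. This suggests that hamiltonian systems tend to be either elliptic or hyperbolic.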


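Electric fields and other gradient fields To better understand the influence that interactions with electromagnetic fields (and other forces which can be expressed as the gradient of a potential) have on the dynamics of a particle, let us start with considering the purely electric case for a non-relativistic particle. Here, the interaction is given by the standard Hamiltonian

$$ H(q,p) = \frac{1}{2m} p^2 + V(q) , $$

which gives the interaction to the electric field $E = -\nabla_q V$. Here, the vector field

$$ X_H(q,p) = \begin{pmatrix} \frac{p}{m} \\ -\nabla_q V(q) \end{pmatrix} $$

vanishes at $(q_0,0)$ where $q_0$ is a critical point of $V$, i. e. $\nabla_q V(q_0) = 0$. Its linearization

$$ DX_H(q,p) = \begin{pmatrix} 0 & \frac{1}{m} \mathrm{id}_{\mathbb{R}^n} \\ -\mathrm{Hess}\, V(q) & 0 \end{pmatrix} $$

involves the Hessian

$$ \mathrm{Hess}\, V = \bigl( \partial_{q_j} \partial_{q_k} V \bigr)_{1 \le j,k \le n} $$

of the potential. The block structure allows us to simplify the characteristic polynomial using

$$ \det \begin{pmatrix} A & B \\ C & D \end{pmatrix} = \det A \; \det\bigl( D - C A^{-1} B \bigr) . $$

Then the zeros of the characteristic polynomial

$$ \begin{aligned} \chi_{q_0}(\lambda) &= \det\bigl( \lambda \, \mathrm{id}_{\mathbb{R}^{2n}} - DX_H(q_0,0) \bigr) \\ &= \det \begin{pmatrix} \lambda \, \mathrm{id}_{\mathbb{R}^n} & -\frac{1}{m} \mathrm{id}_{\mathbb{R}^n} \\ +\mathrm{Hess}\, V(q_0) & \lambda \, \mathrm{id}_{\mathbb{R}^n} \end{pmatrix} \\ &= \det\bigl( \lambda \, \mathrm{id}_{\mathbb{R}^n} \bigr) \, \det\bigl( \lambda \, \mathrm{id}_{\mathbb{R}^n} + m^{-1} \lambda^{-1} \, \mathrm{Hess}\, V(q_0) \bigr) \\ &= \det\bigl( \lambda^2 \, \mathrm{id}_{\mathbb{R}^n} + m^{-1} \, \mathrm{Hess}\, V(q_0) \bigr) \end{aligned} $$

are the square roots of the eigenvalues of $-m^{-1} \, \mathrm{Hess}\, V(q_0)$. In case $q_0$ is a local minimum with non-degenerate Hessian, for instance, then

$$ \mathrm{Hess}\, V(q_0) > 0 \iff x \cdot \mathrm{Hess}\, V(q_0) x > 0 \quad \forall x \in \mathbb{R}^n \setminus \{0\} \qquad (3.6.1) $$

holds in the sense of matrices; this equation is equivalent to requiring all eigenvalues $\omega_j$ of the Hessian to be positive. Hence, the eigenvalues of the linearized vector field are

$$ \lambda_j^{\pm} = \pm \sqrt{-\frac{\omega_j}{m}} = \pm i \sqrt{\frac{\omega_j}{m}} , $$

which means $(q_0,0)$ is a marginally stable, elliptic fixed point.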
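In one dimension the linearization at a critical point $q_0$ is the $2 \times 2$ matrix with rows $(0, 1/m)$ and $(-V''(q_0), 0)$, so its eigenvalues solve $\lambda^2 + V''(q_0)/m = 0$. The sketch below evaluates them for one positive and one negative value of $V''(q_0)$; the numbers and the function name are illustrative assumptions.

```python
import cmath


def linearization_eigenvalues(hessian_value, mass):
    """Eigenvalues of [[0, 1/m], [-V''(q0), 0]], i.e. the roots of lambda^2 + V''(q0)/m = 0."""
    root = cmath.sqrt(-hessian_value / mass)
    return root, -root


elliptic = linearization_eigenvalues(4.0, 1.0)     # V''(q0) > 0: purely imaginary pair
hyperbolic = linearization_eigenvalues(-4.0, 1.0)  # V''(q0) < 0: real pair of opposite sign
```

In both cases the two eigenvalues sum to zero, in line with the trace-less linearization: a positive second derivative gives an elliptic fixed point, a negative one a hyperbolic fixed point.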


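Magnetic fields In case only a magnetic field is present, we consider the Hamilton function $H(q,p) = \frac{1}{2m} p^2$ and the magnetic equations of motion (3.5.4). The corresponding vector field

$$ X_H(q,p) = \frac{1}{m} \begin{pmatrix} p \\ B(q) p \end{pmatrix} $$

linearizes to

$$ DX_H(q,p) = \frac{1}{m} \begin{pmatrix} 0 & \mathrm{id}_{\mathbb{R}^3} \\ B'(q) p & B(q) \end{pmatrix} $$

where

$$ B' p := \nabla_q^T (B p) . $$

Any of the fixed points are of the form $(q_0,0)$, so that we need to find the eigenvalues of

$$ DX_H(q_0,0) = \frac{1}{m} \begin{pmatrix} 0 & \mathrm{id}_{\mathbb{R}^3} \\ 0 & B(q_0) \end{pmatrix} . $$

Using the block form of the matrix, we see right away that three eigenvalues are 0 while the others are, up to a factor of $1/m$, the eigenvalues of $B(q_0)$:

$$ \begin{aligned} \chi_{q_0}(\lambda) &:= \det\bigl( \lambda \, \mathrm{id}_{\mathbb{R}^6} - DX_H(q_0,0) \bigr) \\ &= \det \begin{pmatrix} \lambda \, \mathrm{id}_{\mathbb{R}^3} & -\frac{1}{m} \mathrm{id}_{\mathbb{R}^3} \\ 0 & \lambda \, \mathrm{id}_{\mathbb{R}^3} - \frac{1}{m} B(q_0) \end{pmatrix} \\ &= \lambda^3 \, \det\bigl( \lambda \, \mathrm{id}_{\mathbb{R}^3} - \tfrac{1}{m} B(q_0) \bigr) \end{aligned} $$

The eigenvalues of $B(q_0)$ are 0 and $\pm i |B(q_0)|$. This can be seen from $B = \overline{B} = -B^T$, $\mathrm{tr}\, B = 0$ and $\det B = 0$ which implies: (i) $\lambda_1 = 0$, (ii) $\lambda_2 = \overline{\lambda_3}$ and $\lambda_2 + \lambda_3 = 0$.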
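The stated spectrum of the magnetic field matrix can be made plausible with a few matrix-vector products. The sketch below, with an arbitrarily chosen field vector and hypothetical helper names, checks that $Bp = p \times B$, that the field vector itself lies in the kernel of the matrix (eigenvalue 0), and that $B^3 = -|B|^2 B$, which only allows the eigenvalues $0$ and $\pm i|B|$.

```python
def magnetic_matrix(b):
    """Antisymmetric 3x3 matrix built from the field vector b = (B_1, B_2, B_3)."""
    b1, b2, b3 = b
    return [[0.0, b3, -b2], [-b3, 0.0, b1], [b2, -b1, 0.0]]


def apply(matrix, v):
    """Matrix-vector product for a 3x3 matrix."""
    return [sum(matrix[i][k] * v[k] for k in range(3)) for i in range(3)]


def cross(a, b):
    """Cross product a x b in three dimensions."""
    return [a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]]


field = [0.4, -1.0, 2.0]  # illustrative field vector
p = [1.5, 0.2, -0.7]      # illustrative momentum

B = magnetic_matrix(field)
Bp = apply(B, p)
p_cross_field = cross(p, field)               # should coincide with Bp
kernel = apply(B, field)                      # B applied to the field vector vanishes
norm_squared = sum(c * c for c in field)
cubed = apply(B, apply(B, apply(B, p)))       # B^3 p, should equal -|B|^2 B p
```

The relation $B^3 = -|B|^2 B$ means every eigenvalue satisfies $\lambda^3 = -|B|^2 \lambda$, which is consistent with the eigenvalues quoted in the text.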

Hence, magnetic fields have metastable, elliptic fixed points which are all of the form $(q_0, 0)$. That means we are confronted with two problems: first of all, 4 of the eigenvalues are 0, so there are many metastable directions. The second one is much more serious: linearization is a local technique, and studying the stability via linearization hinges on the fact that you can separate fixed points by open neighborhoods. But here, none of the fixed points can be isolated from the others (in the sense that there does not exist an open neighborhood which contains only a single fixed point). So one needs more care: for instance, it is crucial to look at how the direction of $B$ changes; looking at the linearization of the vector field is insufficient.

Chapter 4

Banach & Hilbert spaces 


This section intends to introduce some fundamental notions on Banach spaces; those are 
vector spaces of functions where the notion of distance is compatible with the linear struc¬ 
ture. In addition, Hilbert spaces also allow one to introduce a notion of angle via a scalar 
product. 

Those notions are crucial to understand PDEs and ODEs, because these are defined on a domain (similar to domains of functions). For instance, one may ask:

(i) How does the existence of solutions depend on the domain, e. g. by imposing differ¬ 
ent boundary conditions? 

(ii) How well can I approximate a solution with a given set of base vectors? This is 
important for numerics, because one needs to approximate elements of infinite¬ 
dimensional spaces by finite linear combinations. 


4.1 Banach spaces 

Many vector spaces $X$ can be equipped with a norm $\|\cdot\|$, and if they are complete with respect to that norm, the pair $(X, \|\cdot\|)$ is a Banach space.


4.1.1 Abstract Banach spaces 

The abstract definition of Banach spaces is quite helpful when we construct Banach spaces 
from other Banach spaces (e. g. Banach spaces of integrable, vector-valued functions).^ 

^ We will only consider vector spaces over C, although much of what we do works just fine if the field of scalars 
is R. 
Definition 4.1.1 (Normed space) Let $X$ be a vector space. A mapping $\|\cdot\| : X \to [0,+\infty)$ with properties

(i) $\|x\| = 0$ if and only if $x = 0$,

(ii) $\|\alpha x\| = |\alpha| \, \|x\|$, and

(iii) $\|x + y\| \le \|x\| + \|y\|$,

for all $x,y \in X$, $\alpha \in \mathbb{C}$, is called norm. The pair $(X, \|\cdot\|)$ is then referred to as normed space.

A norm on $X$ quite naturally induces a metric by setting

$$ d(x,y) := \|x - y\| $$

for all $x,y \in X$. Unless specifically mentioned otherwise, one always works with the metric induced by the norm.

Definition 4.1.2 (Banach space) A complete normed space is a Banach space. 

Example The space $X = C([a,b],\mathbb{C})$ from the previous list of examples has a norm, the sup norm

$$ \|f\|_\infty = \sup_{x \in [a,b]} |f(x)| . $$

Since $C([a,b],\mathbb{C})$ is complete, it is a Banach space.

4.1.2 Prototypical Banach spaces: $L^p(\Omega)$ spaces

The prototypical examples of Banach spaces are the so-called $L^p$ spaces or p-integrable spaces; a rigorous definition requires a bit more care, so we refer the interested reader to [LL01].

When we say integrable, we mean integrable with respect to the Lebesgue measure [LL01, p. 6 ff.]. For any open or closed set $\Omega \subseteq \mathbb{R}^n$, the space of p-integrable functions $\mathcal{L}^p(\Omega)$ is a $\mathbb{C}$-vector space, but

$$ \|\varphi\|_{\mathcal{L}^p(\Omega)} := \Bigl( \int_\Omega \mathrm{d}x \, |\varphi(x)|^p \Bigr)^{1/p} $$

is not a norm: there are functions $\varphi \neq 0$ for which $\|\varphi\| = 0$. Instead, $\|\varphi\| = 0$ only ensures

$$ \varphi(x) = 0 \quad \text{almost everywhere (with respect to the Lebesgue measure } \mathrm{d}x\text{)} . $$

Almost everywhere is sometimes abbreviated with a. e. and the terms “almost surely” and “for almost all $x \in \Omega$” can be used synonymously. If we introduce the equivalence relation

$$ \varphi \sim \psi \; :\Leftrightarrow \; \|\varphi - \psi\| = 0 , $$

then we can define the vector space $L^p(\Omega)$:


Definition 4.1.3 ($L^p(\Omega)$) Let $1 \le p < \infty$. Then we define

$$ \mathcal{L}^p(\Omega) := \Bigl\{ f : \Omega \to \mathbb{C} \; \Big| \; f \text{ measurable, } \int_\Omega \mathrm{d}x \, |f(x)|^p < \infty \Bigr\} $$

as the vector space of functions whose pth power is integrable. Then $L^p(\Omega)$ is the vector space

$$ L^p(\Omega) := \mathcal{L}^p(\Omega) / \sim $$

consisting of equivalence classes of functions that agree almost everywhere. With the p norm

$$ \|f\|_p := \Bigl( \int_\Omega \mathrm{d}x \, |f(x)|^p \Bigr)^{1/p} $$

it forms a normed space.

In practice, one usually does not distinguish between equivalence classes of functions $[f]$ (which make up $L^p(\Omega)$) and functions $f$. This abuse of notation is pervasive in the literature and it is perfectly acceptable to write $f \in L^p(\Omega)$ even though strictly speaking, one should write $[f] \in L^p(\Omega)$. Only when necessary, one takes into account that $f = 0$ actually means $f(x) = 0$ for almost all $x \in \Omega$.

In case p = oo, we have to modify the definition a little bit. 

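Definition 4.1.4 ($L^\infty(\Omega)$) We define

$$ \mathcal{L}^\infty(\Omega) := \bigl\{ f : \Omega \to \mathbb{C} \; \big| \; f \text{ measurable, } \exists \, 0 < K < \infty : |f(x)| \le K \text{ almost everywhere} \bigr\} $$

to be the space of functions that are bounded almost everywhere and

$$ \|f\|_\infty := \operatorname*{ess\,sup}_{x \in \Omega} |f(x)| := \inf \bigl\{ K \ge 0 \; \big| \; |f(x)| \le K \text{ for almost all } x \in \Omega \bigr\} . $$

Then the space $L^\infty(\Omega) := \mathcal{L}^\infty(\Omega) / \sim$ is defined as the vector space of equivalence classes where two functions are identified if they agree almost everywhere.

Theorem 4.1.5 (Riesz-Fischer) For any $1 \le p \le \infty$, $L^p(\Omega)$ is complete with respect to the $\|\cdot\|_p$ norm and thus, a Banach space.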


2013.10.08 
Definition 4.1.6 (Separable Banach space) A Banach space $X$ is called separable if there exists a countable dense subset.

This condition is equivalent to asking that the space has a countable basis.

Theorem 4.1.7 For any $1 \le p < \infty$, the Banach space $L^p(\Omega)$ is separable.

Proof We refer to [LL01, Lemma 2.17] for an explicit construction. The idea is to approximate arbitrary functions by functions which are constant on cubes and take only values in the rational complex numbers. □

For future reference, we collect a few facts on $L^p(\Omega)$ spaces. In particular, we will make use of dominated convergence frequently. We state them without proof; the proofs can be found in standard textbooks on analysis, see e. g. [LL01].

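Theorem 4.1.8 (Monotone Convergence) Let $(f_k)_{k \in \mathbb{N}}$ be a non-decreasing sequence of functions in $L^1(\Omega)$ with pointwise limit $f$ defined almost everywhere. Define $I_k := \int_\Omega \mathrm{d}x \, f_k(x)$; then the sequence $(I_k)$ is non-decreasing as well. If $I := \lim_{k \to \infty} I_k < \infty$, then $I = \int_\Omega \mathrm{d}x \, f(x)$, i. e.

$$ \lim_{k \to \infty} \int_\Omega \mathrm{d}x \, f_k(x) = \int_\Omega \mathrm{d}x \, \lim_{k \to \infty} f_k(x) = \int_\Omega \mathrm{d}x \, f(x) $$

holds.

Theorem 4.1.9 (Dominated Convergence) Let $(f_k)_{k \in \mathbb{N}}$ be a sequence of functions in $L^1(\Omega)$ that converges almost everywhere pointwise to some $f : \Omega \to \mathbb{C}$. If there exists a non-negative $g \in L^1(\Omega)$ such that $|f_k(x)| \le g(x)$ holds almost everywhere for all $k \in \mathbb{N}$, then $g$ also bounds $|f|$, i. e. $|f(x)| \le g(x)$ almost everywhere, and $f \in L^1(\Omega)$. Furthermore, the limit $k \to \infty$ and integration with respect to $x$ commute and we have

$$ \lim_{k \to \infty} \int_\Omega \mathrm{d}x \, f_k(x) = \int_\Omega \mathrm{d}x \, \lim_{k \to \infty} f_k(x) = \int_\Omega \mathrm{d}x \, f(x) . $$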


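Example (First half of Riemann-Lebesgue lemma) We define the Fourier transform of $f \in L^1(\mathbb{R}^n)$ as

$$ (\mathcal{F} f)(\xi) := \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} \mathrm{d}x \, e^{-i \xi \cdot x} f(x) . $$

The integrability of $f$ implies that $\mathcal{F} f$ is uniformly bounded:

$$ |(\mathcal{F} f)(\xi)| \le \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} \mathrm{d}x \, \bigl| e^{-i \xi \cdot x} f(x) \bigr| = \frac{1}{(2\pi)^{n/2}} \, \|f\|_{L^1(\mathbb{R}^n)} $$

In fact, $\mathcal{F} f$ is continuous, and the crucial tool in the proof is dominated convergence: to see that $\mathcal{F} f$ is continuous in $\xi_0 \in \mathbb{R}^n$, let $(\xi_n)$ be any sequence converging to $\xi_0$. Since we can bound the integrand uniformly in $\xi$,

$$ \bigl| e^{-i \xi \cdot x} f(x) \bigr| \le |f(x)| , $$

dominated convergence applies, and we may interchange integration and the limit,

$$ \begin{aligned} \lim_{n \to \infty} (\mathcal{F} f)(\xi_n) &= \lim_{n \to \infty} \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} \mathrm{d}x \, e^{-i \xi_n \cdot x} f(x) \\ &= \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} \mathrm{d}x \, \lim_{n \to \infty} \bigl( e^{-i \xi_n \cdot x} f(x) \bigr) \\ &= \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} \mathrm{d}x \, e^{-i \xi_0 \cdot x} f(x) = (\mathcal{F} f)(\xi_0) . \end{aligned} $$

This means $\mathcal{F} f$ is continuous in $\xi_0$. However, $\xi_0$ was chosen arbitrarily so that we have in fact $\mathcal{F} f \in L^\infty(\mathbb{R}^n) \cap C(\mathbb{R}^n)$.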


4.1.3 Boundary value problems 

Now let us consider the wave equation 

$$ \partial_t^2 u - \partial_x^2 u = 0 \qquad (4.1.1) $$

in one dimension for the initial conditions

$$ u(0,x) = \varphi(x) , \qquad \partial_t u(0,x) = \psi(x) . $$

This formal description defines only half of the wave equation, the other half is to state 
clearly what space u is taken from, and if it should satisfy additional conditions. A priori 
it is not clear whether u is a function of R or a subset of R, say, an interval [a, b]. The 
derivatives appearing in (4.1.1) need only exist in the interior of the spatial domain, 
e. g. (a, b). Moreover, we could impose integrability conditions on u, e. g. u e L^(R). 

If $u$ is a function from $[0,L]$ to $\mathbb{C}$, for instance, it turns out that we need to specify the behavior of $u$ or $u'$ at the boundary, e. g.

(i) Dirichlet boundary conditions: $u(t,0) = 0 = u(t,L)$

(ii) Neumann boundary conditions: $\partial_x u(t,0) = 0 = \partial_x u(t,L)$

(iii) Mixed or Robin boundary conditions: $\alpha_0 u(t,0) + \beta_0 \partial_x u(t,0) = 0$, $\alpha_1 u(t,L) + \beta_1 \partial_x u(t,L) = 0$

If one of the boundaries is located at $\pm\infty$, then the corresponding boundary condition often becomes meaningless because “$u(t,+\infty)$” usually makes no sense.

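Let us start by solving (4.1.1) via a product ansatz, i. e. we assume $u$ is of the form

$$ u(t,x) = \tau(t) \, \xi(x) $$

for suitable functions $\xi$ and $\tau$. Plugging the product ansatz into the wave equation and assuming that $\tau(t)$ and $\xi(x)$ are non-zero yields

$$ \ddot\tau(t) \, \xi(x) - \tau(t) \, \xi''(x) = 0 \iff \frac{\ddot\tau(t)}{\tau(t)} = \frac{\xi''(x)}{\xi(x)} = \lambda \in \mathbb{C} . $$

This means $\tau$ and $\xi$ each need to satisfy the harmonic oscillator equation,

$$ \ddot\tau - \lambda \tau = 0 , \qquad \xi'' - \lambda \xi = 0 . $$

Note that these two equations are coupled via the constant $\lambda$ which has yet to be determined. The solutions to these equations are

$$ \tau(t) = \begin{cases} a_1(0) + a_2(0) \, t & \lambda = 0 \\ a_1(\lambda) \, e^{+t\sqrt{\lambda}} + a_2(\lambda) \, e^{-t\sqrt{\lambda}} & \lambda \neq 0 \end{cases} $$

and

$$ \xi(x) = \begin{cases} b_1(0) + b_2(0) \, x & \lambda = 0 \\ b_1(\lambda) \, e^{+x\sqrt{\lambda}} + b_2(\lambda) \, e^{-x\sqrt{\lambda}} & \lambda \neq 0 \end{cases} $$

for $a_1(\lambda), a_2(\lambda), b_1(\lambda), b_2(\lambda) \in \mathbb{C}$ where we always choose the root $\sqrt{\lambda}$ whose imaginary part is positive. These solutions are smooth functions on $\mathbb{R} \times \mathbb{R}$.

The wave equation on $\mathbb{R}$ As mentioned in problem 4 on sheet 1, conditions on $u$ now restrict the admissible values for $\lambda$. For instance, if we ask that

$$ u \in L^\infty(\mathbb{R} \times \mathbb{R}) , $$

then this condition only allows $\lambda \le 0$ and excludes the linear solutions, i. e. $a_2(0) = 0 = b_2(0)$.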
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The wave equation on [0, L] with Dirichiet boundary conditions Now assume we are in¬ 
terested in the case where the wave equation is considered on the interval [0, L] with 
Dirichiet boundary conditions, u(t,0) = 0 = u(t,L). It turns out that the boundary con¬ 
dition only allows for a discrete set of negative A: one can see easily that for A = 0, only 
the trivial solution satisfies the Dirichiet boundary conditions, i. e. ai(0) = a2(0) = 0 and 

bi(0) = = 0. 

For A 7 ^ 0, the first boundary condition 

?,(0) = ai(A)e+°'^ + a2(A)e-°'^ 

= ai(A)-|-a2(A) = 0 

implies a 2 (^) = Plugging that back into the second equation yields 

?,(L) = ai(A)(e+"'^-e-^'^)=0 

which is equivalent to 

e2i^=l. 

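The solutions to this equation are of the form

\[ \sqrt{\lambda} = \frac{i n \pi}{L} \]

for some $n \in \mathbb{N}$, i. e. $\lambda = -\frac{n^2 \pi^2}{L^2} < 0$. Moreover, as discussed in problem 4, the only admissible
solutions of the spatial equation are

\[ \xi_n(x) = \sin \frac{n \pi x}{L} , \qquad n \in \mathbb{N} . \]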

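That means, the solutions are indexed by an integer $n \in \mathbb{N}$,

\[ u_n(t,x) = \Bigl( a_1(n) \, e^{+i \frac{n \pi t}{L}} + a_2(n) \, e^{-i \frac{n \pi t}{L}} \Bigr) \sin \frac{n \pi x}{L} , \]

and a generic solution is of the form

\[ u(t,x) = \sum_{n \in \mathbb{N}} \Bigl( a_1(n) \, e^{+i \frac{n \pi t}{L}} + a_2(n) \, e^{-i \frac{n \pi t}{L}} \Bigr) \sin \frac{n \pi x}{L} . \tag{4.1.2} \]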

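To obtain the coefficients, we need to solve

\[ u(0,x) = \sum_{n \in \mathbb{N}} \bigl( a_1(n) + a_2(n) \bigr) \sin \frac{n \pi x}{L} = \varphi(x) , \]

\[ \partial_t u(0,x) = \sum_{n \in \mathbb{N}} \frac{i n \pi}{L} \bigl( a_1(n) - a_2(n) \bigr) \sin \frac{n \pi x}{L} = \psi(x) . \]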
Later on, we will see that all integrable functions on intervals can be expanded in terms
of $e^{+i \lambda x}$ for suitable choices of $\lambda$ (cf. Chapter 6). Hence, we can expand

\[ \varphi(x) = \sum_{n \in \mathbb{N}} b_{\varphi}(n) \sin \frac{n \pi x}{L} , \]

\[ \psi(x) = \sum_{n \in \mathbb{N}} b_{\psi}(n) \sin \frac{n \pi x}{L} \]

in the same fashion as u, and we obtain the following relation between the coefficients:

\[ b_{\varphi}(n) = a_1(n) + a_2(n) , \]

\[ b_{\psi}(n) = \frac{i n \pi}{L} \bigl( a_1(n) - a_2(n) \bigr) . \]

For each n, we obtain two linear equations with two unknowns, and we can solve them
explicitly. Now the question is whether the sum in $u(t,\cdot)$ is also integrable: if we assume
for now that the coefficients of $\varphi$ and $\psi$ are absolutely summable^1,

\[ \sum_{n \in \mathbb{N}} \bigl| a_1(n) + a_2(n) \bigr| < \infty , \]

\[ \sum_{n \in \mathbb{N}} |n| \, \bigl| a_1(n) - a_2(n) \bigr| < \infty , \]


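then we deduce

\[ \begin{aligned} 2 \, |a_1(n)| &= |a_1(n) + a_1(n)| \\ &= |a_1(n) - a_2(n) + a_2(n) + a_1(n)| \\ &\leq |a_1(n) + a_2(n)| + |a_1(n) - a_2(n)| \\ &\leq |a_1(n) + a_2(n)| + |n| \, |a_1(n) - a_2(n)| , \end{aligned} \]

where the last step uses $|n| \geq 1$; the same bound holds for $2 \, |a_2(n)|$.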

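Thus, the expression on the right-hand side of (4.1.2) converges in $L^1([0,L])$,

\[ |u(t,x)| \leq \sum_{n \in \mathbb{N}} \Bigl| a_1(n) \, e^{+i \frac{n \pi t}{L}} + a_2(n) \, e^{-i \frac{n \pi t}{L}} \Bigr| \, \Bigl| \sin \frac{n \pi x}{L} \Bigr| \leq \sum_{n \in \mathbb{N}} \bigl( |a_1(n)| + |a_2(n)| \bigr) < \infty . \]

Since the bound is independent of t we deduce $u(t,\cdot)$ exists for all $t \in \mathbb{R}$.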

Overall, we have shown the following: if we place enough conditions on the initial
values $u(0) = \varphi$ and $\partial_t u(0) = \psi$ (here: $\varphi, \psi \in L^1([0,L])$ and absolutely convergent
Fourier series), then in fact $u(t) \in L^1([0,L])$ is integrable for all times (bounded functions
on bounded intervals are integrable).

^1 In Chapter 6 we will give explicit conditions on $\varphi$ and $\psi$ which ensure the absolute summability
of the Fourier coefficients.
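
The two linear equations per mode can be solved by hand, and the following Python sketch does exactly that. It is only an illustration under assumptions that are not part of the text: the interval length `length = 1.0`, the sample coefficients `b_phi` and `b_psi`, and the helper names `wave_coefficients` and `evaluate` are all hypothetical.

```python
import cmath
import math

def wave_coefficients(b_phi, b_psi, length):
    """Solve b_phi(n) = a1 + a2 and b_psi(n) = (i n pi / L)(a1 - a2) for every mode n."""
    coeffs = {}
    for n in sorted(set(b_phi) | set(b_psi)):
        total = complex(b_phi.get(n, 0.0))                               # a1 + a2
        diff = complex(b_psi.get(n, 0.0)) * length / (1j * n * math.pi)  # a1 - a2
        coeffs[n] = ((total + diff) / 2, (total - diff) / 2)
    return coeffs

def evaluate(coeffs, length, t, x):
    """Evaluate the (finite) sum (4.1.2) at time t and position x."""
    value = 0j
    for n, (a1, a2) in coeffs.items():
        omega = n * math.pi / length
        time_part = a1 * cmath.exp(1j * omega * t) + a2 * cmath.exp(-1j * omega * t)
        value += time_part * math.sin(omega * x)
    return value

# Hypothetical initial data with only finitely many non-zero sine coefficients.
length = 1.0
b_phi = {1: 1.0, 3: 0.5}       # phi(x) = sin(pi x) + 0.5 sin(3 pi x)
b_psi = {2: 2 * math.pi}       # psi(x) = 2 pi sin(2 pi x)
coeffs = wave_coefficients(b_phi, b_psi, length)
```

With finitely many modes the sum is trivially convergent; the summability conditions above only matter once infinitely many coefficients are non-zero.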
4.2 Hilbert spaces

Hilbert spaces $\mathcal{H}$ are Banach spaces with a scalar product $\langle \cdot , \cdot \rangle$ which allows us to measure
the “angle” between two vectors. Most importantly, it yields a characterization of vectors
which are orthogonal to one another, giving rise to the notion of orthonormal bases (ONB).
This type of basis is particularly efficient to work with and has some rather nice properties
(e. g. a Pythagorean theorem holds).


4.2.1 Abstract Hilbert spaces 


First, let us define a Hilbert space in the abstract, starting with 

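Definition 4.2.1 (pre-Hilbert space and Hilbert space) A pre-Hilbert space is a complex
vector space $\mathcal{H}$ with scalar product

\[ \langle \cdot , \cdot \rangle : \mathcal{H} \times \mathcal{H} \longrightarrow \mathbb{C} , \]

i. e. a mapping with properties

(i) $\langle \varphi , \varphi \rangle \geq 0$ and $\langle \varphi , \varphi \rangle = 0$ implies $\varphi = 0$ (positive definiteness),

(ii) $\langle \varphi , \psi \rangle^* = \langle \psi , \varphi \rangle$, and

(iii) $\langle \varphi , \alpha \psi + \chi \rangle = \alpha \, \langle \varphi , \psi \rangle + \langle \varphi , \chi \rangle$

for all $\varphi, \psi, \chi \in \mathcal{H}$ and $\alpha \in \mathbb{C}$. This induces a natural norm $\|\varphi\| := \sqrt{\langle \varphi , \varphi \rangle}$ and metric
$d(\varphi,\psi) := \|\varphi - \psi\|$, $\varphi, \psi \in \mathcal{H}$. If $\mathcal{H}$ is complete with respect to the induced metric, it is a
Hilbert space.

The scalar product induces a norm

\[ \|f\| := \sqrt{\langle f , f \rangle} \]

and a metric $d(f,g) := \|f - g\|$.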
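Example (i) $\mathbb{C}^n$ with scalar product

\[ \langle z , w \rangle := \sum_{j=1}^{n} z_j^* \, w_j \]

is a Hilbert space.

(ii) $C([a,b],\mathbb{C})$ with scalar product

\[ \langle f , g \rangle := \int_a^b \mathrm{d}x \, f(x)^* \, g(x) \]

is just a pre-Hilbert space, since it is not complete.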
4.2.2 Orthonormal bases and orthogonal subspaces

Hilbert spaces have the important notion of orthonormal vectors and sequences which do
not exist in Banach spaces.

Definition 4.2.2 (Orthonormal set) Let $\mathcal{I}$ be a countable index set. A family of vectors
$\{\varphi_k\}_{k \in \mathcal{I}}$ is called orthonormal set if for all $k, j \in \mathcal{I}$

\[ \langle \varphi_k , \varphi_j \rangle = \delta_{kj} \]

holds.


As we will see, all vectors in a separable Hilbert space can be written in terms of a
countable orthonormal basis. Especially when we want to approximate elements in a
Hilbert space by elements in a proper closed subspace, the vector of best approximation
can be written as a linear combination of basis vectors.


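Definition 4.2.3 (Orthonormal basis) Let $\mathcal{I}$ be a countable index set. An orthonormal set
of vectors $\{\varphi_k\}_{k \in \mathcal{I}}$ is called orthonormal basis if and only if for all $\psi \in \mathcal{H}$, we have

\[ \psi = \sum_{k \in \mathcal{I}} \langle \varphi_k , \psi \rangle \, \varphi_k . \]

If $\mathcal{I}$ is countably infinite, $\mathcal{I} = \mathbb{N}$, then this means the sequence $\psi_n := \sum_{j=1}^{n} \langle \varphi_j , \psi \rangle \, \varphi_j$
of partial sums converges in norm to $\psi$,

\[ \lim_{n \to \infty} \Bigl\| \psi - \sum_{j=1}^{n} \langle \varphi_j , \psi \rangle \, \varphi_j \Bigr\| = 0 . \]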


Example Hermitian matrices $H = H^* \in \mathrm{Mat}_{\mathbb{C}}(n)$ give rise to a set of orthonormal vectors,
namely the set of eigenvectors. To see that this is so, let $v_k$ be an eigenvector to $\lambda_k$ (the
eigenvalues are repeated according to their multiplicity). Then for all eigenvectors $v_j$ and
$v_k$, we compute

\[ \begin{aligned} 0 &= \langle v_j , H v_k \rangle - \langle v_j , H v_k \rangle = \lambda_k \, \langle v_j , v_k \rangle - \langle H^* v_j , v_k \rangle \\ &= \lambda_k \, \langle v_j , v_k \rangle - \langle H v_j , v_k \rangle = (\lambda_k - \lambda_j) \, \langle v_j , v_k \rangle \end{aligned} \]

where we have used that the eigenvalues of hermitian matrices are real (this follows
from repeating the above argument for $j = k$). Hence, either $\langle v_j , v_k \rangle = 0$ if $\lambda_j \neq \lambda_k$ or
$\lambda_j = \lambda_k$. In the latter case, we obtain a higher-dimensional subspace of $\mathbb{C}^n$ associated
to the eigenvalue $\lambda_k$ for which we can construct an orthonormal basis using the Gram-Schmidt
procedure.
Thus, we obtain a basis of eigenvectors $\{v_j\}_{j=1}^{n}$. This basis is particularly convenient
when working with the matrix $H$ since

\[ H w = H \sum_{j=1}^{n} \langle v_j , w \rangle \, v_j = \sum_{j=1}^{n} \lambda_j \, \langle v_j , w \rangle \, v_j . \]

We will extend these arguments in the next chapter to operators on infinite-dimensional 
Hilbert spaces where one needs to take a little more care. 

With this general notion of orthogonality, we have a Pythagorean theorem: 

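Theorem 4.2.4 (Pythagoras) Given a finite orthonormal family $\{\varphi_1, \ldots, \varphi_n\}$ in a pre-Hilbert
space $\mathcal{H}$ and $\varphi \in \mathcal{H}$, we have

\[ \|\varphi\|^2 = \sum_{k=1}^{n} \bigl| \langle \varphi_k , \varphi \rangle \bigr|^2 + \Bigl\| \varphi - \sum_{k=1}^{n} \langle \varphi_k , \varphi \rangle \, \varphi_k \Bigr\|^2 . \]

Proof It is easy to check that $\psi := \sum_{k=1}^{n} \langle \varphi_k , \varphi \rangle \, \varphi_k$ and $\psi^{\perp} := \varphi - \sum_{k=1}^{n} \langle \varphi_k , \varphi \rangle \, \varphi_k$ are
orthogonal and $\varphi = \psi + \psi^{\perp}$. Hence, we obtain

\[ \begin{aligned} \|\varphi\|^2 &= \langle \varphi , \varphi \rangle = \langle \psi + \psi^{\perp} , \psi + \psi^{\perp} \rangle = \langle \psi , \psi \rangle + \langle \psi^{\perp} , \psi^{\perp} \rangle \\ &= \Bigl\| \sum_{k=1}^{n} \langle \varphi_k , \varphi \rangle \, \varphi_k \Bigr\|^2 + \Bigl\| \varphi - \sum_{k=1}^{n} \langle \varphi_k , \varphi \rangle \, \varphi_k \Bigr\|^2 , \end{aligned} \]

and by orthonormality the first term equals $\sum_{k=1}^{n} | \langle \varphi_k , \varphi \rangle |^2$.
This concludes the proof. □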

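Simple corollaries are Bessel's inequality and the Cauchy-Schwarz inequality.

Theorem 4.2.5 Let $\mathcal{H}$ be a pre-Hilbert space.

(i) Bessel's inequality holds: let $\{\varphi_1, \ldots, \varphi_n\}$ be a finite orthonormal sequence. Then

\[ \|\psi\|^2 \geq \sum_{j=1}^{n} \bigl| \langle \varphi_j , \psi \rangle \bigr|^2 \]

holds for all $\psi \in \mathcal{H}$.

(ii) The Cauchy-Schwarz inequality holds, i. e.

\[ | \langle \varphi , \psi \rangle | \leq \|\varphi\| \, \|\psi\| \]

is valid for all $\varphi, \psi \in \mathcal{H}$.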
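Proof (i) This follows trivially from the previous Theorem as $\bigl\| \psi - \sum_{j=1}^{n} \langle \varphi_j , \psi \rangle \, \varphi_j \bigr\|^2 \geq 0$.

(ii) Pick $\varphi, \psi \in \mathcal{H}$. In case $\varphi = 0$, the inequality holds. So assume $\varphi \neq 0$ and define

\[ \varphi_1 := \frac{\varphi}{\|\varphi\|} \]

which has norm 1. We can apply (i) for n = 1 to conclude

\[ \|\psi\|^2 \geq \bigl| \langle \varphi_1 , \psi \rangle \bigr|^2 = \frac{| \langle \varphi , \psi \rangle |^2}{\|\varphi\|^2} . \]

This is equivalent to the Cauchy-Schwarz inequality. □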
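
As a numerical sanity check of the Pythagorean theorem, Bessel's inequality and the Cauchy-Schwarz inequality, the following Python sketch works in $\mathbb{C}^3$ with the scalar product that is antilinear in the first argument. The vectors `phi_1`, `phi_2`, `phi` and `psi` are hypothetical sample data, and the helper names are not taken from the text.

```python
import math

def inner(u, v):
    """<u, v> = sum_j conj(u_j) v_j: antilinear in the first slot, linear in the second."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(inner(u, u).real)

# Two orthonormal vectors in C^3 (hypothetical sample data).
s = 1 / math.sqrt(2)
phi_1 = [s + 0j, s * 1j, 0j]
phi_2 = [0j, 0j, 1 + 0j]
family = (phi_1, phi_2)

phi = [1 + 2j, -1j, 0.5 + 0j]
psi = [2 - 1j, 3 + 0j, 1j]

# Coefficients <phi_k, phi>, the projection onto span{phi_1, phi_2}, and the remainder.
coeffs = [inner(p, phi) for p in family]
projection = [sum(c * p[j] for c, p in zip(coeffs, family)) for j in range(3)]
remainder = [a - b for a, b in zip(phi, projection)]

lhs = norm(phi) ** 2
bessel_sum = sum(abs(c) ** 2 for c in coeffs)
pythagoras_rhs = bessel_sum + norm(remainder) ** 2
```

Here `lhs` and `pythagoras_rhs` agree up to rounding, while `bessel_sum` is smaller because the family does not span all of $\mathbb{C}^3$.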

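An important corollary says that the scalar product is continuous with respect to the norm
topology. This is not at all surprising, after all the norm is induced by the scalar product!

Corollary 4.2.6 Let $\mathcal{H}$ be a Hilbert space. Then the scalar product is continuous with respect
to the norm topology, i. e. for two sequences $(\varphi_n)_{n \in \mathbb{N}}$ and $(\psi_m)_{m \in \mathbb{N}}$ that converge to $\varphi$ and
$\psi$, respectively, we have

\[ \lim_{n,m \to \infty} \langle \varphi_n , \psi_m \rangle = \langle \varphi , \psi \rangle . \]

Proof Let $(\varphi_n)_{n \in \mathbb{N}}$ and $(\psi_m)_{m \in \mathbb{N}}$ be two sequences in $\mathcal{H}$ that converge to $\varphi$ and $\psi$,
respectively. Then by Cauchy-Schwarz, we have

\[ \begin{aligned} \lim_{n,m \to \infty} \bigl| \langle \varphi , \psi \rangle - \langle \varphi_n , \psi_m \rangle \bigr| &= \lim_{n,m \to \infty} \bigl| \langle \varphi - \varphi_n , \psi \rangle - \langle \varphi_n , \psi_m - \psi \rangle \bigr| \\ &\leq \lim_{n,m \to \infty} \bigl| \langle \varphi - \varphi_n , \psi \rangle \bigr| + \lim_{n,m \to \infty} \bigl| \langle \varphi_n , \psi_m - \psi \rangle \bigr| \\ &\leq \lim_{n,m \to \infty} \|\varphi - \varphi_n\| \, \|\psi\| + \lim_{n,m \to \infty} \|\varphi_n\| \, \|\psi_m - \psi\| = 0 \end{aligned} \]

since there exists some $C > 0$ such that $\|\varphi_n\| \leq C$ for all $n \in \mathbb{N}$. □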

Before we prove that a Hilbert space is separable exactly if it admits a countable basis, we
need to introduce the notion of orthogonal complement: if $A$ is a subset of a pre-Hilbert
space $\mathcal{H}$, then we define

\[ A^{\perp} := \bigl\{ \varphi \in \mathcal{H} \; \big| \; \langle \varphi , \psi \rangle = 0 \;\; \forall \psi \in A \bigr\} . \]

The following few properties of the orthogonal complement follow immediately from its
definition:

(i) $\{0\}^{\perp} = \mathcal{H}$ and $\mathcal{H}^{\perp} = \{0\}$.

(ii) $A^{\perp}$ is a closed linear subspace of $\mathcal{H}$ for any subset $A \subseteq \mathcal{H}$.

(iii) If $A \subseteq B$, then $B^{\perp} \subseteq A^{\perp}$.

(iv) If we denote the sub vector space spanned by the elements in $A$ by $\mathrm{span} \, A$, we have

\[ A^{\perp} = (\mathrm{span} \, A)^{\perp} = \bigl( \overline{\mathrm{span} \, A} \bigr)^{\perp} \]

where $\overline{\mathrm{span} \, A}$ is the completion of $\mathrm{span} \, A$ with respect to the norm topology.


4.2.3 Prototypical Hilbert spaces 


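We have already introduced the Banach space of square-integrable functions on $\Omega \subseteq \mathbb{R}^n$,

\[ L^2(\Omega) := \Bigl\{ \varphi : \Omega \longrightarrow \mathbb{C} \; \Big| \; \varphi \mbox{ measurable, } \int_{\Omega} \mathrm{d}x \, |\varphi(x)|^2 < \infty \Bigr\} , \]

and this space can be naturally equipped with a scalar product,

\[ \langle f , g \rangle = \int_{\Omega} \mathrm{d}x \, f(x)^* \, g(x) . \tag{4.2.1} \]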


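When talking about wave functions in quantum mechanics, the Born rule states that
$|\psi(t,x)|^2$ is to be interpreted as a probability density on $\Omega$ for position (i. e. $\psi$ is normalized,
$\|\psi\| = 1$). Hence, we are interested in solutions to the Schrödinger equation
which are also square integrable. If $\psi_1 \sim \psi_2$ are two normalized functions in $\mathcal{L}^2(\Omega)$ that
represent the same element of $L^2(\Omega)$, i. e. $\|\psi_1 - \psi_2\| = 0$, then
we get the same probabilities for both: if $\Lambda \subseteq \Omega \subseteq \mathbb{R}^n$ is a measurable set, then

\[ \mathbb{P}_1(X \in \Lambda) = \int_{\Lambda} \mathrm{d}x \, |\psi_1(x)|^2 = \int_{\Lambda} \mathrm{d}x \, |\psi_2(x)|^2 = \mathbb{P}_2(X \in \Lambda) . \]

This is proven via the triangle inequality and the Cauchy-Schwarz inequality:

\[ \begin{aligned} 0 \leq \bigl| \mathbb{P}_1(X \in \Lambda) - \mathbb{P}_2(X \in \Lambda) \bigr| &= \Bigl| \int_{\Lambda} \mathrm{d}x \, |\psi_1(x)|^2 - \int_{\Lambda} \mathrm{d}x \, |\psi_2(x)|^2 \Bigr| \\ &= \Bigl| \int_{\Lambda} \mathrm{d}x \, \bigl( \psi_1(x) - \psi_2(x) \bigr)^* \, \psi_1(x) + \int_{\Lambda} \mathrm{d}x \, \psi_2(x)^* \, \bigl( \psi_1(x) - \psi_2(x) \bigr) \Bigr| \\ &\leq \int_{\Lambda} \mathrm{d}x \, |\psi_1(x) - \psi_2(x)| \, |\psi_1(x)| + \int_{\Lambda} \mathrm{d}x \, |\psi_2(x)| \, |\psi_1(x) - \psi_2(x)| \\ &\leq \|\psi_1 - \psi_2\| \, \|\psi_1\| + \|\psi_2\| \, \|\psi_1 - \psi_2\| = 0 \end{aligned} \]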


Very often, another space is used in applications (e. g. in tight-binding models): 


2013.10.15 
Definition 4.2.7 ($\ell^2(S)$) Let $S$ be a countable set. Then

\[ \ell^2(S) := \Bigl\{ c : S \longrightarrow \mathbb{C} \; \Big| \; \sum_{j \in S} |c_j|^2 < \infty \Bigr\} \]

is the Hilbert space of square-summable sequences with scalar product $\langle c , c' \rangle := \sum_{j \in S} c_j^* \, c'_j$.

On $\ell^2(S)$ the scalar product induces the norm $\|c\| := \sqrt{\langle c , c \rangle}$. With respect to this norm,
$\ell^2(S)$ is complete.

4.2.4 Best approximation 

If $(\mathcal{H},d)$ is a metric space, we can define the distance between a point $\varphi \in \mathcal{H}$ and a subset
$A \subseteq \mathcal{H}$ as

\[ d(\varphi,A) := \inf_{\psi \in A} d(\varphi,\psi) . \]

If there exists $\varphi_0 \in A$ which minimizes the distance, i. e. $d(\varphi,A) = d(\varphi,\varphi_0)$, then $\varphi_0$ is
called element of best approximation for $\varphi$ in $A$. This notion is helpful to understand why
and how elements in an infinite-dimensional Hilbert space can be approximated by finite
linear combinations, something that is used in numerics all the time.

If $A \subset \mathcal{H}$ is a closed convex subset of a Hilbert space $\mathcal{H}$, then one can show that there always
exists an element of best approximation. In case $A$ is a closed linear subspace of $\mathcal{H}$, it is given by
projecting an arbitrary $\psi \in \mathcal{H}$ down to the subspace $A$.

Theorem 4.2.8 Let $A$ be a closed convex subset of a Hilbert space $\mathcal{H}$. Then there exists for
each $\varphi \in \mathcal{H}$ exactly one $\varphi_0 \in A$ such that

\[ d(\varphi,A) = d(\varphi,\varphi_0) . \]

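Proof We choose a sequence $(\psi_n)_{n \in \mathbb{N}}$ in $A$ with $d(\varphi,\psi_n) = \|\varphi - \psi_n\| \to d(\varphi,A)$. This
sequence is also a Cauchy sequence: we add and subtract $\varphi$ to get

\[ \|\psi_n - \psi_m\|^2 = \bigl\| (\psi_n - \varphi) + (\varphi - \psi_m) \bigr\|^2 . \]

If $\mathcal{H}$ were merely a normed space, we would have to use the triangle inequality to estimate the
right-hand side from above. However, $\mathcal{H}$ is a Hilbert space and by using the parallelogram
identity^2 we see that the right-hand side is actually equal to

\[ \begin{aligned} \|\psi_n - \psi_m\|^2 &= 2 \, \|\psi_n - \varphi\|^2 + 2 \, \|\psi_m - \varphi\|^2 - \|\psi_n + \psi_m - 2 \varphi\|^2 \\ &= 2 \, \|\psi_n - \varphi\|^2 + 2 \, \|\psi_m - \varphi\|^2 - 4 \, \bigl\| \tfrac{1}{2} (\psi_n + \psi_m) - \varphi \bigr\|^2 \\ &\leq 2 \, \|\psi_n - \varphi\|^2 + 2 \, \|\psi_m - \varphi\|^2 - 4 \, d(\varphi,A)^2 \\ &\xrightarrow{n,m \to \infty} 2 \, d(\varphi,A)^2 + 2 \, d(\varphi,A)^2 - 4 \, d(\varphi,A)^2 = 0 . \end{aligned} \]

^2 For all $\varphi, \psi \in \mathcal{H}$, the identity $2 \, \|\varphi\|^2 + 2 \, \|\psi\|^2 = \|\varphi + \psi\|^2 + \|\varphi - \psi\|^2$ holds.

By convexity, $\tfrac{1}{2} (\psi_n + \psi_m)$ is again an element of $A$, which is what justifies the inequality. This is crucial once again for the
uniqueness argument. Letting $n, m \to \infty$, we see that $(\psi_n)_{n \in \mathbb{N}}$ is a Cauchy sequence in
$A$ which converges in $A$ as it is a closed subset of $\mathcal{H}$. Let us call the limit point $\varphi_0 := \lim_{n \to \infty} \psi_n$.
Then $\varphi_0$ is an element of best approximation,

\[ \|\varphi - \varphi_0\| = \lim_{n \to \infty} \|\varphi - \psi_n\| = d(\varphi,A) . \]

To show uniqueness, we assume that there exists another element of best approximation
$\varphi_0' \in A$. Define the sequence $(\psi_n)_{n \in \mathbb{N}}$ by $\psi_{2n} := \varphi_0$ for even indices and $\psi_{2n+1} := \varphi_0'$
for odd indices. By assumption, we have $\|\varphi - \varphi_0\| = d(\varphi,A) = \|\varphi - \varphi_0'\|$ and thus, by
repeating the steps above, we conclude $(\psi_n)_{n \in \mathbb{N}}$ is a Cauchy sequence that converges to
some element. However, since the sequence is alternating, the two elements $\varphi_0' = \varphi_0$ are
in fact identical. □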

As we have seen, the condition that the set is convex and closed is crucial in the proof.
Otherwise the minimizer may not be unique or even contained in the set.

This is all very abstract. For the case of a closed subvector space $E \subseteq \mathcal{H}$, we can express
the element of best approximation in terms of the basis: not surprisingly, it is given by the
projection of $\varphi$ onto $E$.

Theorem 4.2.9 Let $E$ be a closed subspace of a Hilbert space $\mathcal{H}$ that is spanned by countably
many orthonormal basis vectors $\{\varphi_k\}_{k \in \mathcal{I}}$. Then for any $\varphi \in \mathcal{H}$, the element of best
approximation $\varphi_0 \in E$ is given by

\[ \varphi_0 = \sum_{k \in \mathcal{I}} \langle \varphi_k , \varphi \rangle \, \varphi_k . \]

keX 

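Proof It is easy to show that $\varphi - \varphi_0$ is orthogonal to any $\psi = \sum_{k \in \mathcal{I}} \lambda_k \, \varphi_k \in E$: we focus
on the more difficult case when $E$ is not finite-dimensional. Then, we have to approximate
$\varphi_0$ and $\psi$ by finite linear combinations and take limits. We call $\varphi_0^{(n)} := \sum_{k=1}^{n} \langle \varphi_k , \varphi \rangle \, \varphi_k$
and $\psi^{(m)} := \sum_{l=1}^{m} \lambda_l \, \varphi_l$. With that, we have

\[ \begin{aligned} \bigl\langle \varphi - \varphi_0^{(n)} , \psi^{(m)} \bigr\rangle &= \Bigl\langle \varphi - \sum_{k=1}^{n} \langle \varphi_k , \varphi \rangle \, \varphi_k , \sum_{l=1}^{m} \lambda_l \, \varphi_l \Bigr\rangle \\ &= \sum_{l=1}^{m} \lambda_l \, \langle \varphi , \varphi_l \rangle - \sum_{k=1}^{n} \sum_{l=1}^{m} \lambda_l \, \langle \varphi_k , \varphi \rangle^* \, \langle \varphi_k , \varphi_l \rangle \\ &= \sum_{l=1}^{m} \lambda_l \, \langle \varphi , \varphi_l \rangle \Bigl( 1 - \sum_{k=1}^{n} \delta_{kl} \Bigr) . \end{aligned} \]

By continuity of the scalar product, Corollary 4.2.6, we can take the limit $n, m \to \infty$. The
term in parentheses containing the sum is 0 exactly when $l \in \{1, \ldots, n\}$ and 1 otherwise.
Specifically, if $n \geq m$, the right-hand side vanishes identically. Hence, we have

\[ \langle \varphi - \varphi_0 , \psi \rangle = \lim_{n,m \to \infty} \bigl\langle \varphi - \varphi_0^{(n)} , \psi^{(m)} \bigr\rangle = 0 , \]

in other words $\varphi - \varphi_0 \in E^{\perp}$. This, in turn, implies by the Pythagorean theorem that

\[ \|\varphi - \psi\|^2 = \|\varphi - \varphi_0\|^2 + \|\varphi_0 - \psi\|^2 \geq \|\varphi - \varphi_0\|^2 \]

and hence $\|\varphi - \varphi_0\| = d(\varphi,E)$. Put another way, $\varphi_0$ is an element of best approximation.
Let us now show uniqueness. Assume there exists another element of best approximation
$\varphi_0' = \sum_{k \in \mathcal{I}} \lambda_k' \, \varphi_k$. Then we know by repeating the previous calculation backwards that
$\varphi - \varphi_0' \in E^{\perp}$ and the scalar product with respect to any of the basis vectors which span
$E$ has to vanish,

\[ \begin{aligned} 0 = \langle \varphi_k , \varphi - \varphi_0' \rangle &= \langle \varphi_k , \varphi \rangle - \sum_{l \in \mathcal{I}} \lambda_l' \, \langle \varphi_k , \varphi_l \rangle = \langle \varphi_k , \varphi \rangle - \sum_{l \in \mathcal{I}} \lambda_l' \, \delta_{kl} \\ &= \langle \varphi_k , \varphi \rangle - \lambda_k' . \end{aligned} \]

This means the coefficients with respect to the basis $\{\varphi_k\}_{k \in \mathcal{I}}$ agree with those of $\varphi_0$.
Hence, the element of approximation is unique, $\varphi_0' = \varphi_0$, and given by the projection of
$\varphi$ onto $E$. □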

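Theorem 4.2.10 Let $E$ be a closed linear subspace of a Hilbert space $\mathcal{H}$. Then

(i) $\mathcal{H} = E \oplus E^{\perp}$, i. e. every vector $\varphi \in \mathcal{H}$ can be uniquely decomposed as $\varphi = \psi + \psi^{\perp}$ with
$\psi \in E$, $\psi^{\perp} \in E^{\perp}$.

(ii) $E^{\perp\perp} = E$.

Proof (i) By Theorem 4.2.8, for each $\varphi \in \mathcal{H}$ there exists $\varphi_0 \in E$ such that $d(\varphi,E) = d(\varphi,\varphi_0)$.
From the proof of the previous theorem, we see that $\varphi_0^{\perp} := \varphi - \varphi_0 \in E^{\perp}$.
Hence, $\varphi = \varphi_0 + \varphi_0^{\perp}$ is a decomposition of $\varphi$. To show that it is unique, assume
$\varphi_0' + \varphi_0'^{\perp} = \varphi = \varphi_0 + \varphi_0^{\perp}$ is another decomposition. Then by subtracting, we are led
to conclude that

\[ E \ni \varphi_0' - \varphi_0 = \varphi_0^{\perp} - \varphi_0'^{\perp} \in E^{\perp} \]

holds. On the other hand, $E \cap E^{\perp} = \{0\}$ and thus $\varphi_0 = \varphi_0'$ and $\varphi_0^{\perp} = \varphi_0'^{\perp}$, the
decomposition is unique.

(ii) It is easy to see that $E \subseteq E^{\perp\perp}$. Let $\varphi \in E^{\perp\perp}$. By the same arguments as above, we
can decompose $\varphi \in E^{\perp\perp} \subseteq \mathcal{H}$ into

\[ \varphi = \varphi_0 + \varphi_0^{\perp} \]

with $\varphi_0 \in E \subseteq E^{\perp\perp}$ and $\varphi_0^{\perp} \in E^{\perp}$. Hence, $\varphi - \varphi_0 \in E^{\perp\perp} \cap E^{\perp} = (E^{\perp})^{\perp} \cap E^{\perp} = \{0\}$
and thus $\varphi = \varphi_0 \in E$. □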
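Now we are in a position to prove the following important Proposition:

Proposition 4.2.11 A Hilbert space $\mathcal{H}$ is separable if and only if there exists a countable
orthonormal basis.

Proof $\Leftarrow$: The set generated by the orthonormal basis $\{\varphi_j\}_{j \in \mathcal{I}}$, $\mathcal{I}$ countable, and coefficients
$z = q + i p$, $q, p \in \mathbb{Q}$, is countable and dense in $\mathcal{H}$,

\[ \Bigl\{ \sum_{j=1}^{n} z_j \, \varphi_j \in \mathcal{H} \; \Big| \; \mathbb{N} \ni n \leq |\mathcal{I}| , \; \varphi_j \in \{\varphi_k\}_{k \in \mathbb{N}} , \; z_j = q_j + i p_j , \; q_j, p_j \in \mathbb{Q} \Bigr\} . \]

$\Rightarrow$: Assume there exists a countable dense subset $\mathcal{D}$, i. e. $\overline{\mathcal{D}} = \mathcal{H}$. If $\mathcal{H}$ is finite dimensional,
the induction terminates after finitely many steps and the proof is simpler. Hence,
we will assume $\mathcal{H}$ to be infinite dimensional. Pick a vector $\tilde{\varphi}_1 \in \mathcal{D} \setminus \{0\}$ and normalize
it. The normalized vector is then called $\varphi_1$. Note that $\varphi_1$ need not be in $\mathcal{D}$. By Theorem 4.2.10,
we can split any $\psi \in \mathcal{D}$ into $\psi_1$ and $\psi_1^{\perp}$ such that $\psi_1 \in \mathrm{span}\{\varphi_1\} =: E_1$,
$\psi_1^{\perp} \in \mathrm{span}\{\varphi_1\}^{\perp} =: E_1^{\perp}$ and

\[ \psi = \psi_1 + \psi_1^{\perp} . \]

Pick a second $\tilde{\varphi}_2 \in \mathcal{D} \setminus E_1$ (which is non-empty). Now we apply Theorem 4.2.9 (which
is in essence Gram-Schmidt orthonormalization) to $\tilde{\varphi}_2$, i. e. we pick the part which is
orthogonal to $\varphi_1$,

\[ \varphi_2' := \tilde{\varphi}_2 - \langle \varphi_1 , \tilde{\varphi}_2 \rangle \, \varphi_1 , \]

and normalize to $\varphi_2$,

\[ \varphi_2 := \frac{\varphi_2'}{\|\varphi_2'\|} . \]

This defines $E_2 := \mathrm{span}\{\varphi_1, \varphi_2\}$ and $\mathcal{H} = E_2 \oplus E_2^{\perp}$.

Now we proceed by induction: assume we are given $E_n = \mathrm{span}\{\varphi_1, \ldots, \varphi_n\}$. Take
$\tilde{\varphi}_{n+1} \in \mathcal{D} \setminus E_n$ and apply Gram-Schmidt once again to yield $\varphi_{n+1}$ which is obtained
from normalizing the vector

\[ \varphi_{n+1}' := \tilde{\varphi}_{n+1} - \sum_{k=1}^{n} \langle \varphi_k , \tilde{\varphi}_{n+1} \rangle \, \varphi_k . \]

This induction yields an orthonormal sequence $\{\varphi_n\}_{n \in \mathbb{N}}$ which is by definition an orthonormal
basis of $E_{\infty} := \overline{\mathrm{span}\{\varphi_n\}_{n \in \mathbb{N}}}$, a closed subspace of $\mathcal{H}$. If $E_{\infty} \subsetneq \mathcal{H}$, we can split the
Hilbert space into $\mathcal{H} = E_{\infty} \oplus E_{\infty}^{\perp}$. Then either $\mathcal{D} \cap (\mathcal{H} \setminus E_{\infty}) = \emptyset$, in which case $\mathcal{D}$ cannot
be dense in $\mathcal{H}$, or $\mathcal{D} \cap (\mathcal{H} \setminus E_{\infty}) \neq \emptyset$. But then we have terminated the induction
prematurely. □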
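
The inductive step of the proof is Gram-Schmidt orthonormalization, and in finite dimensions it can be sketched in a few lines of Python. This is a minimal illustration in $\mathbb{C}^3$, not the construction for a general separable Hilbert space: the input vectors and the tolerance `tol` are hypothetical, and a vector that already lies in the span of its predecessors is skipped, which mirrors picking the next vector from $\mathcal{D} \setminus E_n$. The loop subtracts the components one at a time (the "modified" variant), which in exact arithmetic agrees with the formula for $\varphi_{n+1}'$.

```python
import math

def inner(u, v):
    """<u, v> = sum_j conj(u_j) v_j."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def gram_schmidt(vectors, tol=1e-12):
    """Return an orthonormal list spanning the same subspace as `vectors`."""
    basis = []
    for v in vectors:
        w = [complex(c) for c in v]
        for e in basis:
            coefficient = inner(e, w)            # <phi_k, w>
            w = [wj - coefficient * ej for wj, ej in zip(w, e)]
        length = math.sqrt(inner(w, w).real)
        if length > tol:                         # skip vectors already in the span
            basis.append([wj / length for wj in w])
    return basis

# Hypothetical input: the third vector is the sum of the first two.
vectors = [
    [1, 1j, 0],
    [1, 0, 1],
    [2, 1j, 1],
    [0, 1, 1],
]
basis = gram_schmidt(vectors)
```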
4.2.5 Best approximation on $L^2([-\pi,+\pi])$ using $\{ e^{+inx} \}_{n \in \mathbb{Z}}$

Let us consider the free Schrödinger equation

\[ i \partial_t \psi(t) = -\partial_x^2 \psi(t) , \qquad \psi(0) = \psi_0 \in L^2([-\pi,+\pi]) , \tag{4.2.2} \]

for a suitable initial condition $\psi_0$ (we will be more specific in a moment). Since $[-\pi,+\pi]$
has a boundary, we need to impose boundary conditions. We pick periodic boundary
conditions, i. e. we consider functions for which $\psi(-\pi) = \psi(+\pi)$. It is often convenient
to think of functions with periodic boundary conditions as $2\pi$-periodic functions on $\mathbb{R}$,
i. e. $\psi(x + 2\pi) = \psi(x)$.

Periodic functions are best analyzed using their Fourier series (which we will discuss
in detail in Chapter 6), i. e. their expansion in terms of the exponentials $\{ e^{+inx} \}_{n \in \mathbb{Z}}$. To
simplify computations, let us choose a convenient prefactor for the scalar product,

\[ \langle \varphi , \psi \rangle := \frac{1}{2\pi} \int_{-\pi}^{+\pi} \mathrm{d}x \, \varphi(x)^* \, \psi(x) . \]

Then a quick computation yields that $\{ e^{+inx} \}_{n \in \mathbb{Z}}$ is indeed an orthonormal set:

\[ \bigl\langle e^{+ijx} , e^{+inx} \bigr\rangle = \frac{1}{2\pi} \int_{-\pi}^{+\pi} \mathrm{d}x \, e^{+i(n-j)x} = \begin{cases} 1 & j = n \\ 0 & j \neq n \end{cases} \]

As we will see later, this set of orthonormal vectors is also a basis, and we can expand any
vector $\psi$ in terms of its Fourier components,

\[ \psi(x) = \sum_{n \in \mathbb{Z}} \bigl\langle e^{+inx} , \psi \bigr\rangle \, e^{+inx} . \]

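Using the product ansatz $\psi(t,x) = \tau(t) \, \varphi(x)$ (separation of variables) from Chapter 4.1.3,
we obtain two coupled equations:

\[ i \dot{\tau}(t) \, \varphi(x) = -\tau(t) \, \varphi''(x) \quad \Longleftrightarrow \quad i \frac{\dot{\tau}(t)}{\tau(t)} = -\frac{\varphi''(x)}{\varphi(x)} = \lambda \in \mathbb{C} \]

The periodic boundary conditions $\varphi(-\pi) = \varphi(+\pi)$ as well as the condition $\varphi \in L^2([-\pi,+\pi])$
eliminate many choices of $\lambda \in \mathbb{C}$. The equation for $\varphi$ is just a harmonic oscillator equation,
and the periodicity requirement means that only $\lambda = n^2$ with $n \in \mathbb{N}_0 = \{0, 1, \ldots\}$ are
admissible. The equation for $\tau$ can be solved by elementary integration,

\[ i \dot{\tau}_n(t) = n^2 \, \tau_n(t) \quad \Longrightarrow \quad \tau_n(t) = \tau_n(0) \, e^{-i n^2 t} . \]

Hence, the solution to $\lambda = n^2$ is a scalar multiple of $e^{-i n^2 t} \, e^{\pm inx}$ and the solution to (4.2.2)
can formally be written down as the sum

\[ \psi(t,x) = \sum_{n \in \mathbb{Z}} e^{-i n^2 t} \, \bigl\langle e^{+inx} , \psi_0 \bigr\rangle \, e^{+inx} . \]

A priori it is not clear in what sense this sum converges. It turns out the correct notion of
convergence is to require the finiteness of

\[ \|\psi(t)\|^2 = \sum_{n \in \mathbb{Z}} \bigl| e^{-i n^2 t} \, \bigl\langle e^{+inx} , \psi_0 \bigr\rangle \bigr|^2 = \sum_{n \in \mathbb{Z}} \bigl| \bigl\langle e^{+inx} , \psi_0 \bigr\rangle \bigr|^2 = \|\psi_0\|^2 < \infty . \]

This condition, however, is automatically satisfied since we have assumed $\psi_0$ is an element
of $L^2([-\pi,+\pi])$ from the start. In other words, the dynamics even preserve the $L^2$-norm.
In the next section, we will see that this is not at all accidental, but by design.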

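Now to the part about best approximation: one immediate idea is to allow only $|n| \in \{0, 1, \ldots, N\}$,
because for $\sum_{n \in \mathbb{Z}} | \langle e^{+inx} , \psi_0 \rangle |^2$ to converge, a necessary condition is that
$| \langle e^{+inx} , \psi_0 \rangle |^2 \to 0$ as $|n| \to \infty$. So for any upper bound for the error $\varepsilon > 0$ (which we shall
also call precision), we can find $N(\varepsilon)$ so that the initial condition can be approximated in
norm up to an error $\varepsilon$,

\[ \|\psi_0\|^2 - \sum_{n=-N(\varepsilon)}^{+N(\varepsilon)} \bigl| \bigl\langle e^{+inx} , \psi_0 \bigr\rangle \bigr|^2 = \sum_{|n| > N(\varepsilon)} \bigl| \bigl\langle e^{+inx} , \psi_0 \bigr\rangle \bigr|^2 < \varepsilon^2 . \]

Since all the vectors are orthonormal, the vector of best approximation is

\[ \psi_{\mathrm{best}}(x) = \sum_{n=-N(\varepsilon)}^{+N(\varepsilon)} \bigl\langle e^{+inx} , \psi_0 \bigr\rangle \, e^{+inx} , \]

and we can repeat the above arguments to see that the time-evolved $\psi_{\mathrm{best}}(t)$ stays $\varepsilon$-close
to $\psi(t)$ for all times in norm,

\[ \begin{aligned} \bigl\| \psi(t) - \psi_{\mathrm{best}}(t) \bigr\|^2 &= \sum_{|n| > N(\varepsilon)} \bigl| e^{-i n^2 t} \, \bigl\langle e^{+inx} , \psi_0 \bigr\rangle \bigr|^2 \\ &= \sum_{|n| > N(\varepsilon)} \bigl| \bigl\langle e^{+inx} , \psi_0 \bigr\rangle \bigr|^2 < \varepsilon^2 . \end{aligned} \]

In view of the Grönwall lemma 2.2.6, one may ask why the two are close for all times.
The crucial ingredient here is linearity of (4.2.2).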
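
To see the truncation error shrink in practice, the following Python sketch computes the Fourier coefficients numerically and evaluates the squared error $\|\psi_0\|^2 - \sum_{|n| \leq N} | \langle e^{+inx} , \psi_0 \rangle |^2$ for a few values of $N$. Everything specific here is an assumption made for illustration: the initial condition $\psi_0(x) = |x|$, the midpoint quadrature with `samples` points, and all function names.

```python
import cmath
import math

def inner(f, g, samples=4096):
    """<f, g> = (1 / 2 pi) * integral over [-pi, pi] of conj(f) g, by the midpoint rule."""
    h = 2 * math.pi / samples
    total = 0j
    for k in range(samples):
        x = -math.pi + (k + 0.5) * h
        total += complex(f(x)).conjugate() * g(x)
    return total * h / (2 * math.pi)

def mode(n):
    """The basis function x -> e^{+inx}."""
    return lambda x: cmath.exp(1j * n * x)

def psi0(x):
    # Hypothetical initial condition; |x| extends to a continuous 2 pi-periodic function.
    return abs(x)

def fourier_coefficients(N):
    return {n: inner(mode(n), psi0) for n in range(-N, N + 1)}

def evolve(coeffs, t, x):
    """Truncated solution: sum over |n| <= N of e^{-i n^2 t} c_n e^{+inx}."""
    return sum(cmath.exp(-1j * n * n * t) * c * cmath.exp(1j * n * x)
               for n, c in coeffs.items())

norm_sq = inner(psi0, psi0).real
# Squared truncation error for growing cut-off N.
errors = [norm_sq - sum(abs(c) ** 2 for c in fourier_coefficients(N).values())
          for N in (1, 4, 16)]
```

Because each coefficient is only multiplied by the phase $e^{-i n^2 t}$, the same error values apply at every time $t$, which is the point of the estimate above.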




2013.10.17 


4.3 Linear functionals, dual space and weak convergence 

A very important notion is that of a functional. We have already gotten to know the free 
energy functional 

Efree :2?(£free) ^ [0,+Oo) C C, 

1 '•if 

^ •EfreeCV’) = ^ ‘Z’)) • 

1=1 

This functional, however, is not linear, and it is not defined for all $\psi \in L^2(\mathbb{R}^n)$. Let us
restrict ourselves to a smaller class of functionals:

Definition 4.3.1 (Bounded linear functional) Let $\mathcal{X}$ be a normed space. Then a map

\[ L : \mathcal{X} \longrightarrow \mathbb{C} \]

is a bounded linear functional if and only if

(i) there exists $C > 0$ such that $|L(x)| \leq C \, \|x\|$ and

(ii) $L(x + \mu y) = L(x) + \mu \, L(y)$

hold for all $x, y \in \mathcal{X}$ and $\mu \in \mathbb{C}$.

A very basic fact is that boundedness of a linear functional is equivalent to its continuity. 

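Theorem 4.3.2 Let $L : \mathcal{X} \longrightarrow \mathbb{C}$ be a linear functional on the normed space $\mathcal{X}$. Then the
following statements are equivalent:

(i) $L$ is continuous at $x_0 \in \mathcal{X}$.

(ii) $L$ is continuous.

(iii) $L$ is bounded.

Proof (i) $\Leftrightarrow$ (ii): This follows immediately from the linearity.

(ii) $\Rightarrow$ (iii): Assume $L$ to be continuous. Then it is continuous at 0 and for $\varepsilon = 1$, we can
pick $\delta > 0$ such that

\[ |L(x)| \leq \varepsilon = 1 \]

for all $x \in \mathcal{X}$ with $\|x\| \leq \delta$. By linearity, this implies for any $y \in \mathcal{X} \setminus \{0\}$ that

\[ \Bigl| L \Bigl( \tfrac{\delta}{\|y\|} \, y \Bigr) \Bigr| = \frac{\delta}{\|y\|} \, |L(y)| \leq 1 . \]

Hence, $L$ is bounded with bound $1/\delta$,

\[ |L(y)| \leq \frac{1}{\delta} \, \|y\| . \]

(iii) $\Rightarrow$ (ii): Conversely, if $L$ is bounded by $C > 0$,

\[ |L(x) - L(y)| \leq C \, \|x - y\| \]

holds for all $x, y \in \mathcal{X}$. This means, $L$ is continuous: for $\varepsilon > 0$ pick $\delta = \varepsilon / C$ so that

\[ |L(x) - L(y)| \leq C \, \|x - y\| \leq C \, \frac{\varepsilon}{C} = \varepsilon \]

holds for all $x, y \in \mathcal{X}$ such that $\|x - y\| \leq \varepsilon / C$. □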

Definition 4.3.3 (Dual space) Let $\mathcal{X}$ be a normed space. The dual space $\mathcal{X}'$ is the vector
space of bounded linear functionals endowed with the norm

\[ \|L\|_* := \sup_{x \in \mathcal{X} \setminus \{0\}} \frac{|L(x)|}{\|x\|} = \sup_{\substack{x \in \mathcal{X} \\ \|x\| = 1}} |L(x)| . \]

Independently of whether $\mathcal{X}$ is complete, $\mathcal{X}'$ is a Banach space.

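Proposition 4.3.4 The dual space to a normed linear space $\mathcal{X}$ is a Banach space.

Proof Let $(L_n)_{n \in \mathbb{N}}$ be a Cauchy sequence in $\mathcal{X}'$, i. e. a sequence for which

\[ \|L_k - L_j\|_* \xrightarrow{k,j \to \infty} 0 . \]

We have to show that $(L_n)_{n \in \mathbb{N}}$ converges to some $L \in \mathcal{X}'$. For any $\varepsilon > 0$, there exists
$N(\varepsilon) \in \mathbb{N}$ such that

\[ \|L_k - L_j\|_* < \varepsilon \]

for all $k, j \geq N(\varepsilon)$. This also implies that for any $x \in \mathcal{X}$, $\bigl( L_n(x) \bigr)_{n \in \mathbb{N}}$ is a Cauchy sequence as well,

\[ |L_k(x) - L_j(x)| \leq \|L_k - L_j\|_* \, \|x\| < \varepsilon \, \|x\| . \]

The field of complex numbers is complete and $\bigl( L_n(x) \bigr)_{n \in \mathbb{N}}$ converges to some $L(x) \in \mathbb{C}$.
We now define

\[ L(x) := \lim_{n \to \infty} L_n(x) \]

for any $x \in \mathcal{X}$. Clearly, $L$ inherits the linearity of the $(L_n)_{n \in \mathbb{N}}$. The map $L$ is also bounded:
for any $\varepsilon > 0$, there exists $N(\varepsilon) \in \mathbb{N}$ such that $\|L_j - L_n\|_* < \varepsilon$ for all $j, n \geq N(\varepsilon)$. Then

\[ \bigl| (L - L_n)(x) \bigr| = \lim_{j \to \infty} \bigl| (L_j - L_n)(x) \bigr| \leq \lim_{j \to \infty} \|L_j - L_n\|_* \, \|x\| \leq \varepsilon \, \|x\| \]

holds for all $n \geq N(\varepsilon)$. Since we can write $L$ as $L = L_n + (L - L_n)$, we can estimate the
norm of the linear map $L$ by $\|L\|_* \leq \|L_n\|_* + \varepsilon < \infty$. This means $L$ is a bounded linear
functional on $\mathcal{X}$, and the same estimate shows $\|L - L_n\|_* \leq \varepsilon$ for all $n \geq N(\varepsilon)$, i. e. $L_n \to L$ in $\mathcal{X}'$. □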

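In case of Hilbert spaces, the dual $\mathcal{H}'$ can be canonically identified with $\mathcal{H}$ itself:

Theorem 4.3.5 (Riesz' Lemma) Let $\mathcal{H}$ be a Hilbert space. Then for all $L \in \mathcal{H}'$ there exists
a unique $\psi_L \in \mathcal{H}$ such that

\[ L(\varphi) = \langle \psi_L , \varphi \rangle . \]

In particular, we have $\|L\|_* = \|\psi_L\|$.

Proof Let $\ker L := \{ \varphi \in \mathcal{H} \; | \; L(\varphi) = 0 \}$ be the kernel of the functional $L$; as such it is a
closed linear subspace of $\mathcal{H}$. If $\ker L = \mathcal{H}$, then $0 \in \mathcal{H}$ is the associated vector,

\[ L(\varphi) = 0 = \langle 0 , \varphi \rangle . \]

So assume $\ker L \subsetneq \mathcal{H}$ is a proper subspace. Then we can split $\mathcal{H} = \ker L \oplus (\ker L)^{\perp}$. Pick
$\varphi_0 \in (\ker L)^{\perp} \setminus \{0\}$, i. e. $L(\varphi_0) \neq 0$. Then define

\[ \psi_L := \frac{L(\varphi_0)^*}{\|\varphi_0\|^2} \, \varphi_0 . \]

We will show that $L(\varphi) = \langle \psi_L , \varphi \rangle$. If $\varphi \in \ker L$, then $L(\varphi) = 0 = \langle \psi_L , \varphi \rangle$. One easily
shows that for $\varphi = \alpha \, \varphi_0$, $\alpha \in \mathbb{C}$,

\[ \begin{aligned} L(\varphi) &= L(\alpha \, \varphi_0) = \alpha \, L(\varphi_0) \\ &= \langle \psi_L , \varphi \rangle = \Bigl\langle \frac{L(\varphi_0)^*}{\|\varphi_0\|^2} \, \varphi_0 , \alpha \, \varphi_0 \Bigr\rangle \\ &= \alpha \, L(\varphi_0) \, \frac{\langle \varphi_0 , \varphi_0 \rangle}{\|\varphi_0\|^2} = \alpha \, L(\varphi_0) . \end{aligned} \]

Every $\varphi \in \mathcal{H}$ can be written as

\[ \varphi = \Bigl( \varphi - \frac{L(\varphi)}{L(\varphi_0)} \, \varphi_0 \Bigr) + \frac{L(\varphi)}{L(\varphi_0)} \, \varphi_0 . \]

Then the first term is in the kernel of $L$ while the second one is in the orthogonal complement
of $\ker L$. Hence, $L(\varphi) = \langle \psi_L , \varphi \rangle$ for all $\varphi \in \mathcal{H}$. If there exists a second $\psi_L' \in \mathcal{H}$, then
for any $\varphi \in \mathcal{H}$

\[ 0 = L(\varphi) - L(\varphi) = \langle \psi_L , \varphi \rangle - \langle \psi_L' , \varphi \rangle = \langle \psi_L - \psi_L' , \varphi \rangle . \]

This implies $\psi_L' = \psi_L$ and thus the element $\psi_L$ is unique.

To show $\|L\|_* = \|\psi_L\|$, assume $L \neq 0$. Then, we have

\[ \|L\|_* = \sup_{\|\varphi\| = 1} |L(\varphi)| \geq \Bigl| L \Bigl( \frac{\psi_L}{\|\psi_L\|} \Bigr) \Bigr| = \frac{\langle \psi_L , \psi_L \rangle}{\|\psi_L\|} = \|\psi_L\| . \]

On the other hand, the Cauchy-Schwarz inequality yields

\[ \|L\|_* = \sup_{\|\varphi\| = 1} |L(\varphi)| = \sup_{\|\varphi\| = 1} | \langle \psi_L , \varphi \rangle | \leq \sup_{\|\varphi\| = 1} \|\psi_L\| \, \|\varphi\| = \|\psi_L\| . \]

Putting these two together, we conclude $\|L\|_* = \|\psi_L\|$. □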
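
In $\mathbb{C}^n$ the construction in the proof can be carried out explicitly. The Python sketch below follows it step by step for a hypothetical functional $L(x) = \sum_j a_j x_j$ on $\mathbb{C}^3$: it picks some $\varphi_0 \in (\ker L)^{\perp}$, builds $\psi_L$ from the formula in the proof, and prepares the quantities needed to compare $L(x)$ with $\langle \psi_L , x \rangle$. The coefficients `a`, the scaling factor `(2 + 1j)` and all names are assumptions made for illustration.

```python
import math

def inner(u, v):
    """<u, v> = sum_j conj(u_j) v_j."""
    return sum(complex(p).conjugate() * q for p, q in zip(u, v))

def norm(u):
    return math.sqrt(inner(u, u).real)

a = [2 - 1j, 0.5j, -3 + 0j]      # hypothetical coefficients of the functional

def L(x):
    """A bounded linear functional on C^3: L(x) = sum_j a_j x_j."""
    return sum(aj * xj for aj, xj in zip(a, x))

# (ker L)^perp is spanned by the complex conjugate of a; any non-zero multiple works.
phi_0 = [(2 + 1j) * aj.conjugate() for aj in a]

# psi_L := conj(L(phi_0)) / ||phi_0||^2 * phi_0, as in the proof.
scale = L(phi_0).conjugate() / norm(phi_0) ** 2
psi_L = [scale * p for p in phi_0]

x = [1 + 1j, -2 + 0j, 0.5 - 3j]                  # arbitrary test vector
unit = [p / norm(psi_L) for p in psi_L]          # normalized representer
```

The arbitrary scaling of `phi_0` drops out of `psi_L`, in line with the uniqueness statement, and $|L|$ evaluated on `unit` reproduces $\|\psi_L\|$.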

Definition 4.3.6 (Weak convergence) Let $\mathcal{X}$ be a Banach space. Then a sequence $(x_n)_{n \in \mathbb{N}}$
in $\mathcal{X}$ is said to converge weakly to $x \in \mathcal{X}$ if for all $L \in \mathcal{X}'$

\[ L(x_n) \xrightarrow{n \to \infty} L(x) \]

holds. In this case, one also writes $x_n \rightharpoonup x$.

Weak convergence, as the name suggests, is really weaker than convergence in norm. The
reason why “more” sequences converge is that, in a sense, uniformity is lost. If $\mathcal{X}$ is a Hilbert
space, then applying a functional is the same as computing the inner product with respect
to some vector $\psi_L$. If the “non-convergent part” lies in the orthogonal complement to
$\{\psi_L\}$, then this particular functional does not notice that the sequence has not converged
yet.

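Example Let $\mathcal{H}$ be a separable infinite-dimensional Hilbert space and $\{\varphi_n\}_{n \in \mathbb{N}}$ an orthonormal
basis. Then the sequence $(\varphi_n)_{n \in \mathbb{N}}$ does not converge in norm, for as long as $n \neq k$

\[ \|\varphi_n - \varphi_k\| = \sqrt{2} , \]

but it does converge weakly to 0: for any functional $L = \langle \psi_L , \cdot \rangle$, we see that $\bigl( |L(\varphi_n)| \bigr)_{n \in \mathbb{N}}$
is a sequence in $\mathbb{R}$ that converges to 0. Since $\{\varphi_n\}_{n \in \mathbb{N}}$ is a basis, we can write

\[ \psi_L = \sum_{n=1}^{\infty} \langle \varphi_n , \psi_L \rangle \, \varphi_n \]

and for the sequence of partial sums to converge to $\psi_L$, the sequence of coefficients

\[ \bigl( \langle \varphi_n , \psi_L \rangle \bigr)_{n \in \mathbb{N}} = \bigl( L(\varphi_n)^* \bigr)_{n \in \mathbb{N}} \]

must converge to 0. Since this is true for any $L \in \mathcal{H}'$, we have proven that $\varphi_n \rightharpoonup 0$
(i. e. $\varphi_n \to 0$ weakly).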
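
A finite truncation cannot capture weak convergence itself, but it can illustrate the two computations in the example. The Python sketch below uses the first `DIM` coordinates of $\ell^2(\mathbb{N})$ with real entries; the dimension, the vector `psi` with entries $1/(j+1)$, and the helper names are assumptions made for illustration only.

```python
import math

DIM = 2000                                     # hypothetical truncation dimension

def basis_vector(n):
    """n-th canonical basis vector (0-based) of the truncated space."""
    return [1.0 if j == n else 0.0 for j in range(DIM)]

def inner(u, v):
    # Real entries only, so no complex conjugation is needed here.
    return sum(p * q for p, q in zip(u, v))

def norm(u):
    return math.sqrt(inner(u, u))

# A fixed square-summable vector; it represents the functional L = <psi, . >.
psi = [1.0 / (j + 1) for j in range(DIM)]

# |L(phi_n)| for growing n: these values decay ...
pairings = [abs(inner(psi, basis_vector(n))) for n in (0, 9, 99, 999)]

# ... while distinct basis vectors always stay sqrt(2) apart.
difference = [p - q for p, q in zip(basis_vector(3), basis_vector(7))]
distance = norm(difference)
```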

In case of $\mathcal{X} = L^p(\mathbb{R})$, there are three basic mechanisms for when a sequence of functions
$(f_k)$ does not converge in norm, but only weakly:

(i) $f_k$ oscillates to death: take $f_k(x) = \sin(kx)$ for $0 \leq x \leq 1$ and zero otherwise.

(ii) $f_k$ goes up the spout: pick $g \in L^p(\mathbb{R})$ and define $f_k(x) := k^{1/p} \, g(kx)$. This sequence
explodes near $x = 0$ for large $k$.

(iii) $f_k$ wanders off to infinity: this is the case when for some $g \in L^p(\mathbb{R})$ we define
$f_k(x) := g(x + k)$.

All of these sequences converge weakly to 0, but do not converge in norm.
Chapter 5

Linear operators


Linear operators appear quite naturally in the analysis of linear PDEs. Many PDEs share a
common structure: for instance, the Schrödinger equation

\[ i \partial_t \psi(t) = (-\Delta + V) \, \psi(t) , \qquad \psi(0) = \psi_0 , \tag{5.0.1} \]

and the heat equation

\[ \partial_t \psi(t) = -(-\Delta + V) \, \psi(t) , \qquad \psi(0) = \psi_0 , \tag{5.0.2} \]

both involve the same operator

\[ H = -\Delta + V \]

on the right-hand side. Formally, we can solve these equations in closed form,

\[ \psi(t) = e^{-itH} \psi_0 \]

solves the Schrödinger equation (5.0.1) while $\psi(t) = e^{-tH} \psi_0$ solves the heat equation,
because we can formally compute the derivative

\[ i \frac{\mathrm{d}}{\mathrm{d}t} \psi(t) = i \frac{\mathrm{d}}{\mathrm{d}t} e^{-itH} \psi_0 = H e^{-itH} \psi_0 = H \psi(t) \]

and verify that also the initial condition is satisfied, $\psi(0) = e^{0} \psi_0 = \psi_0$. A priori, these
are just formal manipulations, though: if $H$ were a matrix, we would know how to give rigorous
meaning to these expressions, but in case of operators on infinite-dimensional spaces, this
is much more involved. However, we see that the dynamics of both the Schrödinger and
the heat equation are generated by the same operator $H$.
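
For a matrix the formal manipulations can be checked directly. The Python sketch below uses a hypothetical $2 \times 2$ hermitian matrix with $H^2 = 1$, for which the exponential series collapses to $e^{-itH} = \cos(t) - i \sin(t) \, H$ and $e^{-tH} = \cosh(t) - \sinh(t) \, H$; the matrix, the initial vector and the finite-difference step are assumptions for illustration and say nothing about the infinite-dimensional case.

```python
import math

H = [[0.0, 1.0], [1.0, 0.0]]     # hypothetical hermitian matrix with H*H = identity

def apply(matrix, vector):
    return [sum(matrix[i][j] * vector[j] for j in range(2)) for i in range(2)]

def schroedinger_flow(t, psi0):
    """e^{-itH} psi0 = cos(t) psi0 - i sin(t) H psi0 (valid because H^2 = 1)."""
    h_psi = apply(H, psi0)
    return [math.cos(t) * p - 1j * math.sin(t) * q for p, q in zip(psi0, h_psi)]

def heat_flow(t, psi0):
    """e^{-tH} psi0 = cosh(t) psi0 - sinh(t) H psi0 (valid because H^2 = 1)."""
    h_psi = apply(H, psi0)
    return [math.cosh(t) * p - math.sinh(t) * q for p, q in zip(psi0, h_psi)]

def norm(vector):
    return math.sqrt(sum(abs(c) ** 2 for c in vector))

def time_derivative(flow, t, psi0, step=1e-5):
    """Central finite-difference approximation of d/dt flow(t, psi0)."""
    forward, backward = flow(t + step, psi0), flow(t - step, psi0)
    return [(p - q) / (2 * step) for p, q in zip(forward, backward)]

psi0 = [1.0 + 0.5j, -0.25j]
t = 0.7
```

Comparing `time_derivative` with `apply(H, ...)` reproduces both equations numerically, and only the Schrödinger flow keeps the norm of `psi0` fixed.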
As we will see in Chapter 5.5, also the Maxwell equations can be recast in the form (5.0.1).
This gives one access to all the powerful tools for the analysis of Schrödinger operators in
order to gain understanding of the dynamics of electromagnetic waves (light).

Moreover, one can see that the selfadjointness of $H = H^*$ leads to $U(t) := e^{-itH}$ being a
unitary operator, an operator which preserves the norm on the Hilbert space (“$e^{-itH}$ is just
a phase”). Lastly, states will be closely related to (orthogonal) projections $P$ which satisfy
$P^2 = P$.

This chapter will give a zoology of operators and expound on three particularly important
classes of operators: selfadjoint operators, orthogonal projections and unitary operators as
well as their relations. Given the brevity, much of what we do will not be rigorous. In fact,
some of these results (e. g. Stone's theorem) require extensive preparation until one can
understand all their facets. For us, the important aspect is to elucidate the connections
between these fundamental results and PDEs.

5.1 Bounded operators 

The simplest operators are bounded operators. 

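Definition 5.1.1 (Bounded operator) Let $\mathcal{X}$ and $\mathcal{Y}$ be normed spaces. A linear operator
$T : \mathcal{X} \longrightarrow \mathcal{Y}$ is called bounded if there exists $M \geq 0$ with $\|Tx\|_{\mathcal{Y}} \leq M \, \|x\|_{\mathcal{X}}$ for all $x \in \mathcal{X}$.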

Just as in the case of linear functionals, we have 

Theorem 5.1.2 Let $T : \mathcal{X} \longrightarrow \mathcal{Y}$ be a linear operator between two normed spaces $\mathcal{X}$ and $\mathcal{Y}$.
Then the following statements are equivalent:

(i) $T$ is continuous at $x_0 \in \mathcal{X}$.

(ii) $T$ is continuous.

(iii) $T$ is bounded.

Proof We leave it to the reader to modify the proof of Theorem 4.3.2. □ 

We can introduce a norm on the operators which leads to a natural notion of convergence: 

Definition 5.1.3 (Operator norm) Let $T : \mathcal{X} \longrightarrow \mathcal{Y}$ be a bounded linear operator between
normed spaces. Then we define the operator norm of $T$ as

\[ \|T\| := \sup_{\|x\|_{\mathcal{X}} = 1} \|Tx\|_{\mathcal{Y}} . \]

The space of all bounded linear operators between $\mathcal{X}$ and $\mathcal{Y}$ is denoted by $\mathcal{B}(\mathcal{X},\mathcal{Y})$.
One can show that $\|T\|$ coincides with

\[ \inf \bigl\{ M \geq 0 \; \big| \; \|Tx\|_{\mathcal{Y}} \leq M \, \|x\|_{\mathcal{X}} \;\; \forall x \in \mathcal{X} \bigr\} = \|T\| . \]

The product of two bounded operators $T \in \mathcal{B}(\mathcal{Y},\mathcal{Z})$ and $S \in \mathcal{B}(\mathcal{X},\mathcal{Y})$ is again a bounded
operator and its norm can be estimated from above by

\[ \|TS\| \leq \|T\| \, \|S\| . \]

If $\mathcal{Y} = \mathcal{X} = \mathcal{Z}$, this implies that the product is jointly continuous with respect to the norm
topology on $\mathcal{B}(\mathcal{X})$. For Hilbert spaces, the following useful theorem holds:

Theorem 5.1.4 (Hellinger-Toeplitz) Let $A$ be a linear operator on a Hilbert space $\mathcal{H}$ with
dense domain $\mathcal{D}(A)$ such that $\langle \psi , A \varphi \rangle = \langle A \psi , \varphi \rangle$ holds for all $\varphi, \psi \in \mathcal{D}(A)$. Then $\mathcal{D}(A) = \mathcal{H}$
if and only if $A$ is bounded.

Proof $\Leftarrow$: If $A$ is bounded, then $\|A \varphi\| \leq M \, \|\varphi\|$ for some $M \geq 0$ and all $\varphi \in \mathcal{H}$ by
definition of the norm. Hence, the domain of $A$ is all of $\mathcal{H}$.

$\Rightarrow$: This direction relies on a rather deep result of functional analysis, the so-called Open
Mapping Theorem and its corollary, the Closed Graph Theorem. The interested reader
may look it up in Chapter III.5 of [RS72]. □

Let $T, S$ be bounded linear operators between the normed spaces $\mathcal{X}$ and $\mathcal{Y}$. If we define

\[ (T + S) x := Tx + Sx \]

as addition and

\[ (\lambda \cdot T) x := \lambda \, Tx \]

as scalar multiplication, the set of bounded linear operators forms a vector space.

Proposition 5.1.5 The vector space $\mathcal{B}(\mathcal{X},\mathcal{Y})$ of bounded linear operators between normed
spaces $\mathcal{X}$ and $\mathcal{Y}$ with operator norm forms a normed space. If $\mathcal{Y}$ is complete, $\mathcal{B}(\mathcal{X},\mathcal{Y})$ is a
Banach space.

Proof The fact that $\mathcal{B}(\mathcal{X},\mathcal{Y})$ is a normed vector space follows directly from the definition.
To show that $\mathcal{B}(\mathcal{X},\mathcal{Y})$ is a Banach space whenever $\mathcal{Y}$ is, one has to modify the proof of
Proposition 4.3.4 to suit the current setting. This is left as an exercise. □
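
The operator norm and the estimate $\|TS\| \leq \|T\| \, \|S\|$ can be checked by hand for small matrices. The Python sketch below treats real $2 \times 2$ matrices acting on $\mathbb{R}^2$ with the Euclidean norm, where $\|A\|$ is the square root of the largest eigenvalue of $A^T A$; this closed formula is special to the matrix setting. The matrices `T` and `S` are hypothetical, and the brute-force value `sampled` only probes the supremum over unit vectors on a finite grid of angles.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def apply(A, x):
    return [A[0][0] * x[0] + A[0][1] * x[1], A[1][0] * x[0] + A[1][1] * x[1]]

def operator_norm(A):
    """sup over |x| = 1 of |Ax| for a real 2x2 matrix, via the eigenvalues of A^T A."""
    (a, b), (c, d) = A
    p, q, r = a * a + c * c, a * b + c * d, b * b + d * d    # entries of A^T A
    largest = (p + r) / 2 + math.sqrt(((p - r) / 2) ** 2 + q * q)
    return math.sqrt(largest)

T = [[1.0, 2.0], [0.0, 1.0]]
S = [[0.5, -1.0], [3.0, 0.25]]

# Brute-force lower bound for ||T||: evaluate |Tx| on many unit vectors.
angles = (2 * math.pi * k / 3600 for k in range(3600))
sampled = max(math.hypot(*apply(T, [math.cos(th), math.sin(th)])) for th in angles)
```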

Very often, it is easy to define an operator T on a “nice” dense subset V Q X. Then the 
next theorem tells us that if the operator is bounded, there is a unique bounded extension 
of the operator to the whole space X. For instance, this allows us to instantly extend the 
Fourier transform from Schwartz functions to L^(R") functions (see Proposition 6.2.14). 




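Theorem 5.1.6 Let $\mathcal{D} \subseteq \mathcal{X}$ be a dense subset of a normed space and $\mathcal{Y}$ be a Banach space. Furthermore, let $T : \mathcal{D} \to \mathcal{Y}$ be a bounded linear operator. Then there exists a unique bounded linear extension $\tilde{T} : \mathcal{X} \to \mathcal{Y}$ and $\|\tilde{T}\| = \|T\|_{\mathcal{D}}$.

Proof We construct $\tilde{T}$ explicitly: let $x \in \mathcal{X}$ be arbitrary. Since $\mathcal{D}$ is dense in $\mathcal{X}$, there exists a sequence $(x_n)_{n \in \mathbb{N}}$ in $\mathcal{D}$ which converges to $x$. Then we set

$$\tilde{T}x := \lim_{n \to \infty} T x_n.$$

First of all, $\tilde{T}$ is linear. It is also well-defined: $(Tx_n)_{n \in \mathbb{N}}$ is a Cauchy sequence in $\mathcal{Y}$,

$$\|Tx_n - Tx_k\|_{\mathcal{Y}} \leq \|T\|_{\mathcal{D}} \, \|x_n - x_k\|_{\mathcal{X}} \xrightarrow{n,k \to \infty} 0,$$

where the norm of $T$ is defined as

$$\|T\|_{\mathcal{D}} := \sup_{x \in \mathcal{D} \setminus \{0\}} \frac{\|Tx\|_{\mathcal{Y}}}{\|x\|_{\mathcal{X}}}.$$

This Cauchy sequence in $\mathcal{Y}$ converges to some unique $y \in \mathcal{Y}$ as the target space is complete. Let $(x_n')_{n \in \mathbb{N}}$ be a second sequence in $\mathcal{D}$ that converges to $x$ and assume the sequence $(Tx_n')_{n \in \mathbb{N}}$ converges to some $y' \in \mathcal{Y}$. We define a third sequence $(z_n)_{n \in \mathbb{N}}$ which alternates between elements of the first sequence $(x_n)_{n \in \mathbb{N}}$ and the second sequence $(x_n')_{n \in \mathbb{N}}$, i. e.

$$z_{2n-1} := x_n, \qquad z_{2n} := x_n'$$

for all $n \in \mathbb{N}$. Then $(z_n)_{n \in \mathbb{N}}$ also converges to $x$ and $(Tz_n)_{n \in \mathbb{N}}$ forms a Cauchy sequence that converges to, say, $\zeta \in \mathcal{Y}$. Subsequences of convergent sequences are also convergent and they must converge to the same limit point. Hence, we conclude that

$$\zeta = \lim_{n \to \infty} T z_n = \lim_{n \to \infty} T z_{2n-1} = \lim_{n \to \infty} T x_n = y$$
$$\phantom{\zeta = \lim_{n \to \infty} T z_n} = \lim_{n \to \infty} T z_{2n} = \lim_{n \to \infty} T x_n' = y'$$

holds and $\tilde{T}x$ does not depend on the particular choice of sequence which approximates $x$ in $\mathcal{D}$. It remains to show that $\|\tilde{T}\| = \|T\|_{\mathcal{D}}$: we can calculate the norm of $\tilde{T}$ on the dense subset $\mathcal{D}$ and use that $\tilde{T}|_{\mathcal{D}} = T$ to obtain

$$\|\tilde{T}\| = \sup_{x \in \mathcal{X}, \|x\| = 1} \|\tilde{T}x\| = \sup_{x \in \mathcal{X} \setminus \{0\}} \frac{\|\tilde{T}x\|}{\|x\|} = \sup_{x \in \mathcal{D} \setminus \{0\}} \frac{\|\tilde{T}x\|}{\|x\|} = \sup_{x \in \mathcal{D} \setminus \{0\}} \frac{\|Tx\|}{\|x\|} = \|T\|_{\mathcal{D}}.$$

Hence, the norm of the extension $\tilde{T}$ is equal to the norm of the original operator $T$. □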
The spectrum of an operator is related to the set of possible outcomes of measurements in quantum mechanics.

Definition 5.1.7 (Spectrum) Let $T \in \mathcal{B}(\mathcal{X})$ be a bounded linear operator on a Banach space $\mathcal{X}$. We define:

(i) The resolvent set of $T$ is the set $\rho(T) := \{z \in \mathbb{C} \mid T - z \, \mathrm{id} \text{ is bijective}\}$.

(ii) The spectrum $\sigma(T) := \mathbb{C} \setminus \rho(T)$ is the complement of $\rho(T)$ in $\mathbb{C}$.

(iii) The set of all eigenvalues is called point spectrum

$$\sigma_p(T) := \{z \in \mathbb{C} \mid T - z \, \mathrm{id} \text{ is not injective}\}.$$

(iv) The continuous spectrum is defined as

$$\sigma_c(T) := \{z \in \mathbb{C} \mid T - z \, \mathrm{id} \text{ is injective}, \ \mathrm{im}(T - z \, \mathrm{id}) \subsetneq \mathcal{X} \text{ dense}\}.$$

(v) The remainder of the spectrum is called residual spectrum,

$$\sigma_r(T) := \{z \in \mathbb{C} \mid T - z \, \mathrm{id} \text{ is injective}, \ \mathrm{im}(T - z \, \mathrm{id}) \subseteq \mathcal{X} \text{ not dense}\}.$$

One can show that for all $z \in \rho(T)$, the map $(T - z \, \mathrm{id})^{-1}$ is a bounded operator and the spectrum is a closed subset of $\mathbb{C}$. One can also show that $\sigma(T)$ is compact and contained in $\{\lambda \in \mathbb{C} \mid |\lambda| \leq \|T\|\} \subset \mathbb{C}$.

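Example (Spectrum of $H = -\partial_x^2$) (i) The spectrum of $-\partial_x^2$ on $L^2(\mathbb{R})$ is $\sigma(-\partial_x^2) = \sigma_c(-\partial_x^2) = [0, \infty)$; it is purely continuous.

(ii) The spectrum of the restriction of $-\partial_x^2$ to a bounded domain (e. g. an interval $[a, b]$) turns out to be purely discrete, $\sigma(-\partial_x^2) = \sigma_p(-\partial_x^2)$.

Note that $-\partial_x^2$ need not have point spectrum. If a spectral value $\lambda$ of an operator $H$ on a Banach space $\mathcal{X}$ belongs to the continuous spectrum, then there exists no eigenvector to $\lambda$ in $\mathcal{X}$. For instance, the eigenfunctions of $-\partial_x^2$ to $\lambda^2 \neq 0$ are of the form $e^{\pm i \lambda x}$. However, none of these plane waves is square integrable on $\mathbb{R}$, $\|e^{\pm i \lambda x}\|_{L^2(\mathbb{R})} = \infty$.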
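The discreteness claimed in (ii) can be made plausible numerically. The following is only a sketch under assumptions that are not in the text: it picks Dirichlet boundary conditions on $[0, 1]$ and replaces $-\partial_x^2$ by the standard second-order finite-difference matrix; the helper name `dirichlet_laplacian_eigenvalues` is hypothetical. Under these assumptions the lowest eigenvalues should approximate $(k\pi)^2$, $k = 1, 2, \ldots$

```python
import numpy as np

def dirichlet_laplacian_eigenvalues(length, n_interior, n_eigs):
    """Lowest eigenvalues of the finite-difference matrix for -d^2/dx^2 on
    [0, length] with Dirichlet boundary conditions (assumed, not from the text)."""
    h = length / (n_interior + 1)
    main = 2.0 * np.ones(n_interior) / h**2
    off = -1.0 * np.ones(n_interior - 1) / h**2
    laplacian = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order
    return np.linalg.eigvalsh(laplacian)[:n_eigs]

length = 1.0
numerical = dirichlet_laplacian_eigenvalues(length, n_interior=400, n_eigs=4)
exact = (np.arange(1, 5) * np.pi / length) ** 2   # (k pi / length)^2
relative_error = np.abs(numerical - exact) / exact
```

The eigenvalues are isolated points, in contrast to the interval $[0, \infty)$ obtained on all of $\mathbb{R}$.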


5.2 Adjoint operator 

If $\mathcal{X}$ is a normed space, then we have defined $\mathcal{X}'$, the space of bounded linear functionals on $\mathcal{X}$. If $T : \mathcal{X} \to \mathcal{Y}$ is a bounded linear operator between two normed spaces, it naturally defines the adjoint operator $T' : \mathcal{Y}' \to \mathcal{X}'$ via

$$(T'L)(x) := L(Tx) \qquad (5.2.1)$$

for all $x \in \mathcal{X}$ and $L \in \mathcal{Y}'$. In case of Hilbert spaces, one can associate the Hilbert space adjoint. We will almost exclusively work with the latter and thus drop “Hilbert space” most of the time.

Definition 5.2.1 (Adjoint and selfadjoint operator) Let $\mathcal{H}$ be a Hilbert space and $A \in \mathcal{B}(\mathcal{H})$ be a bounded linear operator on the Hilbert space $\mathcal{H}$. Then for any $\varphi \in \mathcal{H}$, the equation

$$\langle A\psi, \varphi \rangle = \langle \psi, \phi \rangle \qquad \forall \psi \in \mathcal{D}(A)$$

defines a vector $\phi$. For each $\varphi \in \mathcal{H}$, we set $A^*\varphi := \phi$ and $A^*$ is called the (Hilbert space) adjoint of $A$. In case $A^* = A$, the operator is called selfadjoint.

Hilbert and Banach space adjoint are related through the map $C\psi := \langle \psi, \cdot \rangle = L_\psi$, because then the Hilbert space adjoint is defined as

$$A^* := C^{-1} A' C.$$

Example (Adjoint of the time-evolution group) $(e^{-itH})^* = e^{+itH^*} = e^{+itH}$

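Proposition 5.2.2 Let $A, B \in \mathcal{B}(\mathcal{H})$ be two bounded linear operators on a Hilbert space $\mathcal{H}$ and $\alpha \in \mathbb{C}$. Then, we have:

(i) $(A + B)^* = A^* + B^*$

(ii) $(\alpha A)^* = \alpha^* A^*$

(iii) $(AB)^* = B^* A^*$

(iv) $\|A^*\| = \|A\|$

(v) $A^{**} = A$

(vi) $\|A^*A\| = \|AA^*\| = \|A\|^2$

(vii) $\ker A = (\mathrm{im}\, A^*)^\perp$, $\ker A^* = (\mathrm{im}\, A)^\perp$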

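Proof Properties (i)-(iii) follow directly from the definition.

To show (iv), we note that $\|A\| \leq \|A^*\|$ follows from

$$\|A\varphi\| \overset{\star}{=} \sup_{\|L\|_{\mathcal{H}'} = 1} |L(A\varphi)| = \sup_{\|\psi\| = 1} |\langle A^*\psi, \varphi \rangle| \leq \|A^*\| \, \|\varphi\|$$

where in the step marked with $\star$ we have used that we can calculate the norm from picking the functional associated to $A\varphi$ (suitably normalized): for a functional with norm 1, $\|L\|_{\mathcal{H}'} = 1$, the norm of $L(A\varphi)$ cannot exceed that of $A\varphi$,

$$|L(A\varphi)| = |\langle \psi_L, A\varphi \rangle| \leq \|\psi_L\| \, \|A\varphi\| = \|A\varphi\|.$$

Here, $\psi_L$ is the vector such that $L = \langle \psi_L, \cdot \rangle$ which exists by Theorem 4.3.5. This theorem also ensures $\|L\|_{\mathcal{H}'} = \|\psi_L\|$. On the other hand, from

$$\|A^*\psi_L\| = \|L_{A^*\psi_L}\|_{\mathcal{H}'} = \sup_{\|\varphi\| = 1} |\langle A^*\psi_L, \varphi \rangle| \leq \sup_{\|\varphi\| = 1} \|\psi_L\| \, \|A\varphi\| = \|A\| \, \|L\|_{\mathcal{H}'} = \|A\| \, \|\psi_L\|$$

we conclude $\|A^*\| \leq \|A\|$. Hence, $\|A^*\| = \|A\|$.

(v) is clear. For (vi), we remark

$$\|A\|^2 = \sup_{\|\varphi\| = 1} \|A\varphi\|^2 = \sup_{\|\varphi\| = 1} \langle \varphi, A^*A\varphi \rangle \leq \sup_{\|\varphi\| = 1} \|A^*A\varphi\| = \|A^*A\|.$$

This means

$$\|A\|^2 \leq \|A^*A\| \leq \|A^*\| \, \|A\| = \|A\|^2$$

which combined with (iv),

$$\|A\|^2 = \|A^*\|^2 \leq \|AA^*\| \leq \|A\| \, \|A^*\| = \|A\|^2,$$

implies $\|A^*A\| = \|A\|^2 = \|AA^*\|$. (vii) is left as an exercise. □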
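For matrices, where the Hilbert space adjoint is the conjugate transpose and the operator norm is the largest singular value, the identities of Proposition 5.2.2 can be checked directly. The snippet below is a sanity check on randomly chosen $4 \times 4$ matrices, not a proof; the helper names `adj` and `op_norm` are ad hoc and not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
B = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
psi = rng.normal(size=4) + 1j * rng.normal(size=4)
phi = rng.normal(size=4) + 1j * rng.normal(size=4)
alpha = 0.3 - 1.2j

def adj(M):
    """Hilbert space adjoint of a matrix: the conjugate transpose."""
    return M.conj().T

def op_norm(M):
    """Operator norm induced by the Euclidean norm (largest singular value)."""
    return np.linalg.norm(M, 2)

checks = {
    # np.vdot conjugates its first argument, matching an inner product
    # that is antilinear in the first slot.
    "definition": np.isclose(np.vdot(A @ psi, phi), np.vdot(psi, adj(A) @ phi)),
    "(ii)": np.allclose(adj(alpha * A), np.conj(alpha) * adj(A)),
    "(iii)": np.allclose(adj(A @ B), adj(B) @ adj(A)),
    "(iv)": np.isclose(op_norm(adj(A)), op_norm(A)),
    "(vi)": np.isclose(op_norm(adj(A) @ A), op_norm(A) ** 2)
    and np.isclose(op_norm(A @ adj(A)), op_norm(A) ** 2),
}
```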

Definition 5.2.3 Let $\mathcal{H}$ be a Hilbert space and $A \in \mathcal{B}(\mathcal{H})$. Then $A$ is called

(i) selfadjoint (or hermitian) if $A^* = A$,

(ii) unitary if $A^* = A^{-1}$,

(iii) an orthogonal projection if $A^2 = A$ and $A^* = A$, and

(iv) positive semidefinite (or non-negative) iff $\langle \varphi, A\varphi \rangle \geq 0$ for all $\varphi \in \mathcal{H}$ and positive (definite) if the inequality is strict for all $\varphi \neq 0$.

This leads to a particular characterization of the spectrum as a set [RS72, Theorem VII.12]:
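2013.10.24

Theorem 5.2.4 (Weyl’s criterion) Let $H$ be a bounded selfadjoint operator on a Hilbert space $\mathcal{H}$. Then $\lambda \in \sigma(H)$ holds if and only if there exists a sequence $\{\psi_n\}_{n \in \mathbb{N}}$ with $\|\psi_n\| = 1$ and

$$\lim_{n \to \infty} \|H\psi_n - \lambda\psi_n\| = 0.$$

Example (Weyl’s criterion for $H = -\partial_x^2$ on $L^2(\mathbb{R})$) For any $\lambda \in \mathbb{R} \setminus \{0\}$, one can choose a sequence $\{\psi_n\}_{n \in \mathbb{N}}$ of normalized and cut off plane waves $e^{\pm i \lambda x}$, which belong to the spectral value $\lambda^2$. For the $\psi_n$ to stay normalized while the cut-off region grows, their amplitude has to shrink, so that $\psi_n(x) \to 0$ holds pointwise as $n \to \infty$.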
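A Weyl sequence of this kind can be examined numerically. The sketch below makes a specific choice that the text does not prescribe: the cut-off is a Gaussian, $\psi_n(x) = (\pi n^2)^{-1/4} e^{-x^2/(2n^2)} e^{i\lambda x}$. The derivatives of the Gaussian are inserted analytically, and only the $L^2$ norm of $(-\partial_x^2 - \lambda^2)\psi_n$ is computed by a Riemann sum; for this choice the norm should behave like $\sqrt{2}\,|\lambda|/n$ for large $n$.

```python
import numpy as np

def weyl_residual(lam, n, points=20001):
    """L^2 norm of (-d^2/dx^2 - lam^2) psi_n for the Gaussian cut-off plane wave
    psi_n(x) = c_n exp(-x^2 / (2 n^2)) exp(i lam x) (cut-off shape is an assumption)."""
    x = np.linspace(-12.0 * n, 12.0 * n, points)
    dx = x[1] - x[0]
    g = np.exp(-x**2 / (2.0 * n**2))
    dg = -x / n**2 * g                          # g'
    d2g = (x**2 / n**4 - 1.0 / n**2) * g        # g''
    c = (np.pi * n**2) ** (-0.25)               # normalization constant
    # (-d^2/dx^2 - lam^2)(g e^{i lam x}) = -(g'' + 2 i lam g') e^{i lam x};
    # the phase has modulus one and drops out of the norm.
    residual = -c * (d2g + 2j * lam * dg)
    return np.sqrt(np.sum(np.abs(residual) ** 2) * dx)

lam = 1.0
ns = (1, 2, 4, 8, 16)
residuals = [weyl_residual(lam, n) for n in ns]
predicted = [np.sqrt(2.0 * lam**2 / n**2 + 0.75 / n**4) for n in ns]
```

The residuals decrease towards zero although no $\psi_n$ is an eigenfunction, which is the content of Weyl’s criterion for a point of the continuous spectrum.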


5.3 Unitary operators 

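Unitary operators $U$ have the nice property that

$$\langle U\varphi, U\psi \rangle = \langle \varphi, U^*U\psi \rangle = \langle \varphi, U^{-1}U\psi \rangle = \langle \varphi, \psi \rangle$$

for all $\varphi, \psi \in \mathcal{H}$. In case of quantum mechanics, we are interested in solutions to the Schrödinger equation

$$i \frac{d}{dt} \psi(t) = H\psi(t), \qquad \psi(0) = \psi_0,$$

for a hamilton operator which satisfies $H^* = H$. Assume that $H$ is bounded (this is really the case for many simple quantum systems). Then the unitary group generated by $H$,

$$U(t) = e^{-itH},$$

can be written as a power series,

$$e^{-itH} = \sum_{n=0}^{\infty} \frac{1}{n!} (-it)^n H^n,$$

where $H^0 := \mathrm{id}$ by convention. The sequence of partial sums converges in the operator norm to $e^{-itH}$,

$$\sum_{n=0}^{N} \frac{1}{n!} (-it)^n H^n \xrightarrow{N \to \infty} e^{-itH},$$

since we can make the simple estimate

$$\Big\| \sum_{n=0}^{\infty} \frac{1}{n!} (-it)^n H^n \psi \Big\| \leq \sum_{n=0}^{\infty} \frac{1}{n!} |t|^n \|H^n\psi\| \leq \sum_{n=0}^{\infty} \frac{1}{n!} |t|^n \|H\|^n \|\psi\| = e^{|t| \|H\|} \|\psi\| < \infty.$$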
This shows that the power series of the exponential converges in the operator norm independently of the choice of $\psi$ to a bounded operator. Given a unitary evolution group, it is suggestive to obtain the hamiltonian which generates it by differentiating $U(t)\psi$ with respect to time. This is indeed the correct idea. The left-hand side of the Schrödinger equation (modulo a factor of $i$) can be expressed as a limit

$$\frac{d}{dt} \psi(t) = \lim_{\delta \to 0} \frac{1}{\delta} \big( \psi(t + \delta) - \psi(t) \big).$$

This limit really exists, but before we compute it, we note that since

$$\psi(t + \delta) - \psi(t) = e^{-itH} \big( e^{-i\delta H} - 1 \big) \psi_0,$$

it suffices to consider differentiability at $t = 0$: taking limits in norm of $\mathcal{H}$, we get

$$\frac{d}{dt} \psi(0) = \lim_{\delta \to 0} \frac{1}{\delta} \big( \psi(\delta) - \psi_0 \big) = \lim_{\delta \to 0} \frac{1}{\delta} \sum_{n=1}^{\infty} \frac{(-i)^n \delta^n}{n!} H^n \psi_0 = -iH\psi_0.$$

Hence, we have established that $e^{-itH}\psi_0$ solves the Schrödinger equation with initial condition $\psi(0) = \psi_0$,

$$i \frac{d}{dt} \psi(t) = H\psi(t).$$

However, this procedure does not work if $H$ is unbounded (i. e. the generic case)! Before we proceed, we need to introduce several different notions of convergence of sequences of operators which are necessary to define derivatives of $U(t)$.
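For a bounded $H$, the power series and the difference quotient above can be tried out on a finite-dimensional example. This is only a sketch: $H$ is a randomly generated hermitian $5 \times 5$ matrix, the series is truncated after 60 terms, and the function names `exp_series` and `exp_spectral` are hypothetical. The reference value uses the eigendecomposition of $H$ rather than the series.

```python
import numpy as np

def exp_series(H, t, n_terms=60):
    """Partial sum of sum_n (-i t)^n H^n / n! with n_terms terms."""
    term = np.eye(H.shape[0], dtype=complex)    # n = 0 term, H^0 = id
    total = term.copy()
    for n in range(1, n_terms):
        term = term @ (-1j * t * H) / n         # builds (-i t H)^n / n!
        total = total + term
    return total

def exp_spectral(H, t):
    """exp(-i t H) for hermitian H via its eigendecomposition."""
    evals, evecs = np.linalg.eigh(H)
    return (evecs * np.exp(-1j * t * evals)) @ evecs.conj().T

rng = np.random.default_rng(1)
X = rng.normal(size=(5, 5)) + 1j * rng.normal(size=(5, 5))
H = (X + X.conj().T) / 2                        # hermitian by construction
t = 0.7
U = exp_series(H, t)

# Difference quotient i (U(delta) psi0 - psi0) / delta, which should approach H psi0.
delta = 1e-6
psi0 = rng.normal(size=5) + 1j * rng.normal(size=5)
generator_estimate = 1j * (exp_series(H, delta) @ psi0 - psi0) / delta
```

In finite dimensions every operator is bounded, so this check says nothing about the unbounded case discussed next.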

Definition 5.3.1 (Convergence of operators) Let $(A_n)_{n \in \mathbb{N}} \subset \mathcal{B}(\mathcal{H})$ be a sequence of bounded operators. We say that the sequence converges to $A \in \mathcal{B}(\mathcal{H})$

(i) uniformly/in norm if $\lim_{n \to \infty} \|A_n - A\| = 0$.

(ii) strongly if $\lim_{n \to \infty} \|A_n\psi - A\psi\| = 0$ for all $\psi \in \mathcal{H}$.

(iii) weakly if $\lim_{n \to \infty} \langle \varphi, A_n\psi - A\psi \rangle = 0$ for all $\varphi, \psi \in \mathcal{H}$.

Convergence of a sequence of operators in norm implies strong and weak convergence, but not the other way around. In the tutorials, we will also show explicitly that weak convergence does not necessarily imply strong convergence.

Example With the arguments above, we have shown that if $H = H^*$ is selfadjoint and bounded, then $t \mapsto e^{-itH}$ is uniformly continuous.
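If $\|H\| = \infty$ on the other hand, uniform continuity is too strong a requirement. If $H = -\frac{1}{2}\Delta_x$ is the free Schrödinger operator on $L^2(\mathbb{R}^n)$, then the Fourier transform $\mathcal{F}$ links the position representation on $L^2(\mathbb{R}^n)$ to the momentum representation on $L^2(\mathbb{R}^n)$. In this representation, the free Schrödinger operator $H$ simplifies to the multiplication operator

$$\hat{H} = \tfrac{1}{2} \hat{k}^2$$

acting on $L^2(\mathbb{R}^n)$. This operator is not bounded since $\sup_{k \in \mathbb{R}^n} \frac{1}{2} k^2 = \infty$ (cf. problem 24). More elaborate mathematical arguments show that for any $t \in \mathbb{R} \setminus \{0\}$, the norm of the difference between $\hat{U}(t) = e^{-it\frac{1}{2}\hat{k}^2}$ and $\hat{U}(0) = \mathrm{id}$,

$$\|\hat{U}(t) - \mathrm{id}\| = \sup_{k \in \mathbb{R}^n} \big| e^{-it\frac{1}{2}k^2} - 1 \big| = 2,$$

is exactly 2 and $\hat{U}(t)$ cannot be uniformly continuous in $t$. However, if $\hat{\psi} \in L^2(\mathbb{R}^n)$ is a wave function, the estimate

$$\|\hat{U}(t)\hat{\psi} - \hat{\psi}\|^2 = \int_{\mathbb{R}^n} dk \, \big| e^{-it\frac{1}{2}k^2} - 1 \big|^2 \, |\hat{\psi}(k)|^2 \leq 2^2 \int_{\mathbb{R}^n} dk \, |\hat{\psi}(k)|^2 = 4 \|\hat{\psi}\|^2$$

shows we can invoke the Theorem of Dominated Convergence to conclude $\hat{U}(t)$ is strongly continuous in $t \in \mathbb{R}$.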
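The difference between the two notions of continuity is visible numerically in one dimension. The sketch below rests on assumptions made purely for illustration: $k$ is restricted to a finite grid on $[-200, 200]$ (so the supremum is only a lower bound for the operator norm, and smaller $t$ would need a larger grid), and $\hat{\psi}$ is chosen as a normalized Gaussian.

```python
import numpy as np

k = np.linspace(-200.0, 200.0, 400001)
dk = k[1] - k[0]
psi_hat = np.pi ** (-0.25) * np.exp(-k**2 / 2.0)   # normalized Gaussian (an assumption)

def multiplier_minus_one(t):
    """Values of exp(-i t k^2 / 2) - 1 on the grid."""
    return np.exp(-1j * t * k**2 / 2.0) - 1.0

def sup_norm_distance(t):
    """Grid approximation of sup_k |exp(-i t k^2 / 2) - 1|."""
    return np.max(np.abs(multiplier_minus_one(t)))

def strong_distance(t):
    """Riemann-sum approximation of || U(t) psi_hat - psi_hat || in L^2."""
    return np.sqrt(np.sum(np.abs(multiplier_minus_one(t) * psi_hat) ** 2) * dk)

times = [1e-1, 1e-2, 1e-3]
sup_distances = [sup_norm_distance(t) for t in times]
strong_distances = [strong_distance(t) for t in times]
```

On this grid the supremum stays at (essentially) 2 for each of the three times, whereas the distance measured on the fixed vector $\hat{\psi}$ shrinks proportionally to $t$.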


Definition 5.3.2 (Strongly continuous one-parameter unitary evolution group) A family of unitary operators $\{U(t)\}_{t \in \mathbb{R}}$ on a Hilbert space $\mathcal{H}$ is called a strongly continuous one-parameter unitary group, or unitary evolution group for short, if

(i) $t \mapsto U(t)$ is strongly continuous and

(ii) $U(t)U(t') = U(t + t')$ as well as $U(0) = \mathrm{id}_{\mathcal{H}}$

hold for all $t, t' \in \mathbb{R}$.

This is again a group representation of $\mathbb{R}$ just as in the case of the classical flow $\Phi$. The form of the Schrödinger equation,

$$i \frac{d}{dt} \psi(t) = H\psi(t),$$

also suggests that strong continuity/differentiability is the correct notion. Let us once more consider the free hamiltonian $H = -\frac{1}{2}\Delta_x$ on $L^2(\mathbb{R}^n)$. We will show that its domain is

$$\mathcal{D}(H) = \{\varphi \in L^2(\mathbb{R}^n) \mid -\Delta_x\varphi \in L^2(\mathbb{R}^n)\}.$$

In Chapter 7, we will learn that $\mathcal{D}(H)$ is mapped by the Fourier transform onto

$$\mathcal{D}(\hat{H}) = \{\hat{\psi} \in L^2(\mathbb{R}^n) \mid \hat{k}^2\hat{\psi} \in L^2(\mathbb{R}^n)\}.$$


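Dominated Convergence can once more be used to make the following claims rigorous: for any $\hat{\psi} \in \mathcal{D}(\hat{H})$, we have

$$\lim_{t \to 0} \Big\| \tfrac{i}{t} \big( \hat{U}(t) - \mathrm{id} \big) \hat{\psi} - \tfrac{1}{2}\hat{k}^2\hat{\psi} \Big\| \leq \lim_{t \to 0} \Big\| \tfrac{i}{t} \big( \hat{U}(t) - \mathrm{id} \big) \hat{\psi} \Big\| + \Big\| \tfrac{1}{2}\hat{k}^2\hat{\psi} \Big\|. \qquad (5.3.1)$$

The second term is finite since $\hat{\psi} \in \mathcal{D}(\hat{H})$ and we have to focus on the first term. On the level of functions,

$$\lim_{t \to 0} \tfrac{i}{t} \big( e^{-it\frac{1}{2}k^2} - 1 \big) = i \frac{d}{dt} e^{-it\frac{1}{2}k^2} \Big|_{t=0} = \tfrac{1}{2}k^2$$

holds pointwise. Furthermore, by the mean value theorem, for any finite $t \in \mathbb{R}$ with $|t| \leq 1$, for instance, there exists $0 \leq t_0 \leq t$ such that

$$\tfrac{1}{t} \big( e^{-it\frac{1}{2}k^2} - 1 \big) = \partial_t e^{-it\frac{1}{2}k^2} \Big|_{t=t_0} = -i \tfrac{1}{2}k^2 \, e^{-it_0\frac{1}{2}k^2}.$$

2013.10.29

This can be bounded uniformly in $t$ by $\frac{1}{2}k^2$. Thus, also the first term can be bounded by $\|\frac{1}{2}\hat{k}^2\hat{\psi}\|$ uniformly. By Dominated Convergence, we can interchange the limit $t \to 0$ and integration with respect to $k$ on the left-hand side of equation (5.3.1). But then the integrand is zero and thus the domain where the free evolution group is differentiable coincides with the domain of the Fourier transformed hamiltonian,

$$\lim_{t \to 0} \Big\| \tfrac{i}{t} \big( \hat{U}(t) - \mathrm{id} \big) \hat{\psi} - \tfrac{1}{2}\hat{k}^2\hat{\psi} \Big\| = 0.$$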

This suggests the following definition:

Definition 5.3.3 (Generator of a unitary group) A densely defined linear operator $H$ on a Hilbert space $\mathcal{H}$ with domain $\mathcal{D}(H) \subseteq \mathcal{H}$ is called generator of a unitary evolution group $U(t)$, $t \in \mathbb{R}$, if

(i) the domain coincides with

$$\tilde{\mathcal{D}}(H) = \{\varphi \in \mathcal{H} \mid t \mapsto U(t)\varphi \text{ differentiable}\} = \mathcal{D}(H)$$

(ii) and for all $\psi \in \mathcal{D}(H)$, the Schrödinger equation holds,

$$i \frac{d}{dt} U(t)\psi = H U(t)\psi.$$

This is only one of the two implications: usually we are given a hamiltonian $H$ and we would like to know under which circumstances this operator generates a unitary evolution group. We will answer this question conclusively in the next section with Stone’s Theorem.


Theorem 5.3.4 Let $H$ be the generator of a strongly continuous evolution group $U(t)$, $t \in \mathbb{R}$. Then the following holds:

(i) $\mathcal{D}(H)$ is invariant under the action of $U(t)$, i. e. $U(t)\mathcal{D}(H) = \mathcal{D}(H)$ for all $t \in \mathbb{R}$.

(ii) $H$ commutes with $U(t)$, i. e. $[U(t), H]\psi := U(t)H\psi - HU(t)\psi = 0$ for all $t \in \mathbb{R}$ and $\psi \in \mathcal{D}(H)$.

(iii) $H$ is symmetric, i. e. $\langle H\varphi, \psi \rangle = \langle \varphi, H\psi \rangle$ holds for all $\varphi, \psi \in \mathcal{D}(H)$.

(iv) $U(t)$ is uniquely determined by $H$.

(v) $H$ is uniquely determined by $U(t)$.

Proof (i) Let $\psi \in \mathcal{D}(H)$. To show that $U(t)\psi$ is still in the domain, we have to show that the norm of $HU(t)\psi$ is finite. Since $H$ is the generator of $U(t)$, it is equal to

$$H\psi = i \frac{d}{ds} U(s)\psi \Big|_{s=0} = \lim_{s \to 0} \tfrac{i}{s} \big( U(s) - \mathrm{id} \big) \psi.$$

Let us start with $s > 0$ and omit the limit. Then

$$\Big\| \tfrac{i}{s} \big( U(s) - \mathrm{id} \big) U(t)\psi \Big\| = \Big\| U(t) \tfrac{i}{s} \big( U(s) - \mathrm{id} \big) \psi \Big\| = \Big\| \tfrac{i}{s} \big( U(s) - \mathrm{id} \big) \psi \Big\| < \infty$$

holds for all $s > 0$. Taking the limit on left and right-hand side yields that we can estimate the norm of $HU(t)\psi$ by the norm of $H\psi$, which is finite since $\psi$ is in the domain. This means $U(t)\mathcal{D}(H) \subseteq \mathcal{D}(H)$. To show the converse, we repeat the proof for $U(-t) = U(t)^{-1} = U(t)^*$ to obtain

$$\mathcal{D}(H) = U(t)U(-t)\mathcal{D}(H) \subseteq U(t)\mathcal{D}(H).$$

Hence, $U(t)\mathcal{D}(H) = \mathcal{D}(H)$.

(ii) This follows from an extension of the proof of (i): since the domain $\mathcal{D}(H)$ coincides with the set of vectors on which $U(t)$ is strongly differentiable and is left invariant by $U(t)$, taking limits on left- and right-hand side of

$$\tfrac{i}{s} \big( U(s) - \mathrm{id} \big) U(t)\psi - U(t) \tfrac{i}{s} \big( U(s) - \mathrm{id} \big) \psi = 0$$

leads to $[H, U(t)]\psi = 0$.

(iii) This follows from differentiating $\langle U(t)\varphi, U(t)\psi \rangle$ for arbitrary $\varphi, \psi \in \mathcal{D}(H)$ and using $[U(t), H] = 0$ as well as the unitarity of $U(t)$ for all $t \in \mathbb{R}$.

(iv) Assume that both unitary evolution groups, $U(t)$ and $\tilde{U}(t)$, have $H$ as their generator. For any $\psi \in \mathcal{D}(H)$, we can calculate the time derivative of $\|(U(t) - \tilde{U}(t))\psi\|^2$,

$$\frac{d}{dt} \big\| (U(t) - \tilde{U}(t))\psi \big\|^2 = 2 \frac{d}{dt} \Big( \|\psi\|^2 - \mathrm{Re}\, \langle U(t)\psi, \tilde{U}(t)\psi \rangle \Big)$$
$$= -2 \, \mathrm{Re} \Big( \langle -iHU(t)\psi, \tilde{U}(t)\psi \rangle + \langle U(t)\psi, -iH\tilde{U}(t)\psi \rangle \Big) = 0,$$

where the two terms cancel because $H$ is symmetric by (iii). Since $U(0) = \mathrm{id} = \tilde{U}(0)$, this means $U(t)$ and $\tilde{U}(t)$ agree at least on $\mathcal{D}(H)$. Using the fact that there is only one bounded extension of a bounded operator to all of $\mathcal{H}$, Theorem 5.1.6, we conclude they must be equal on all of $\mathcal{H}$.

(v) This follows from the definition of the generator and the density of the domain. □


Now that we have collected a few facts on unitary evolution groups, one could think that symmetric operators generate evolution groups, but this is false! The standard example to showcase this fact is the group of translations on $L^2([0,1])$. Since we would like $T(t)$ to conserve “mass” (or more accurately, probability), we define for $\varphi \in L^2([0,1])$ and $0 \leq t < 1$

$$(T(t)\varphi)(x) := \begin{cases} \varphi(x - t) & x - t \in [0,1] \\ \varphi(x - t + 1) & x - t + 1 \in [0,1] \end{cases}.$$

For all other $t \in \mathbb{R}$, we extend this operator periodically, i. e. we plug in the fractional part of $t$. Clearly, $\langle T(t)\varphi, T(t)\psi \rangle = \langle \varphi, \psi \rangle$ holds for all $\varphi, \psi \in L^2([0,1])$. Locally, the infinitesimal generator is $-i\partial_x$ as a simple calculation shows:

$$i \frac{d}{dt} (T(t)\varphi)(x) \Big|_{t=0} = i \frac{d}{dt} \varphi(x - t) \Big|_{t=0} = -i\partial_x\varphi(x).$$

However, $T(t)$ does not preserve the maximal domain of $-i\partial_x$,

$$\mathcal{D}_{\max}(-i\partial_x) = \{\varphi \in L^2([0,1]) \mid -i\partial_x\varphi \in L^2([0,1])\}.$$

Any element of the maximal domain has a continuous representative, but if $\varphi(0) \neq \varphi(1)$, then for $t > 0$, $T(t)\varphi$ will have a discontinuity at $t$. We will denote the operator $-i\partial_x$ on $\mathcal{D}_{\max}(-i\partial_x)$ with $P_{\max}$. Let us check whether $P_{\max}$ is symmetric: for any $\varphi, \psi \in \mathcal{D}_{\max}(-i\partial_x)$

we compute

$$\langle \varphi, -i\partial_x\psi \rangle = \int_0^1 dx \, \varphi^*(x) \, (-i\partial_x\psi)(x) = \Big[ -i\varphi^*(x)\psi(x) \Big]_0^1 - \int_0^1 dx \, (-i) \, \partial_x\varphi^*(x) \, \psi(x)$$
$$= i \big( \varphi^*(0)\psi(0) - \varphi^*(1)\psi(1) \big) + \int_0^1 dx \, (-i\partial_x\varphi)^*(x) \, \psi(x)$$
$$= i \big( \varphi^*(0)\psi(0) - \varphi^*(1)\psi(1) \big) + \langle -i\partial_x\varphi, \psi \rangle. \qquad (5.3.2)$$

In general, the boundary terms do not disappear and the maximal domain is “too large” for $-i\partial_x$ to be symmetric. Thus, it is not at all surprising that $T(t)$ does not leave $\mathcal{D}_{\max}(-i\partial_x)$ invariant. Let us try another domain: one way to make the boundary terms disappear is to choose

$$\mathcal{D}_{\min}(-i\partial_x) := \{\varphi \in L^2([0,1]) \mid -i\partial_x\varphi \in L^2([0,1]), \ \varphi(0) = 0 = \varphi(1)\}.$$

We denote $-i\partial_x$ on this “minimal” domain with $P_{\min}$. In this case, the boundary terms in equation (5.3.2) vanish which tells us that $P_{\min}$ is symmetric. Alas, the domain is still not invariant under translations $T(t)$, even though $P_{\min}$ is symmetric. This is an example of a symmetric operator which does not generate a unitary group.

There is another thing we have missed so far: the translations allow for an additional phase factor, i. e. for $\varphi \in L^2([0,1])$ and $\vartheta \in [0, 2\pi)$, we define for $0 \leq t < 1$

$$(T_\vartheta(t)\varphi)(x) := \begin{cases} \varphi(x - t) & x - t \in [0,1] \\ e^{-i\vartheta}\varphi(x - t + 1) & x - t + 1 \in [0,1] \end{cases},$$

while for all other $t$, we plug in the fractional part of $t$ (and pick up one factor of $e^{-i\vartheta}$ per full period, so that the group property holds). The additional phase factor cancels in the inner product, $\langle T_\vartheta(t)\varphi, T_\vartheta(t)\psi \rangle = \langle \varphi, \psi \rangle$ still holds true for all $\varphi, \psi \in L^2([0,1])$. In general $T_\vartheta(t) \neq T_{\vartheta'}(t)$ if $\vartheta \neq \vartheta'$ and the unitary groups are genuinely different. Repeating the simple calculation from before, we see that the local generator still is $-i\partial_x$ and it would seem we can generate a family of unitary evolutions from a single generator. The confusion is resolved if we focus on invariant domains: choosing $\vartheta \in [0, 2\pi)$, we define $P_\vartheta$ to be the operator $-i\partial_x$ on the domain

$$\mathcal{D}_\vartheta(-i\partial_x) := \{\varphi \in L^2([0,1]) \mid -i\partial_x\varphi \in L^2([0,1]), \ \varphi(0) = e^{-i\vartheta}\varphi(1)\}.$$

A quick look at equation (5.3.2) reassures us that $P_\vartheta$ is symmetric and a quick calculation shows its domain is also invariant under the action of $T_\vartheta(t)$. Hence, $P_\vartheta$ is the generator of $T_\vartheta$, and the definition of an unbounded operator is incomplete without spelling out its domain.

Example (The wave equation with boundary conditions) Another example where the domain is crucial for the properties is the wave equation on $[0, L]$,

$$\partial_t^2 u(x,t) - \partial_x^2 u(x,t) = 0, \qquad u \in C^2([0,L] \times \mathbb{R}).$$
Here, $u$ is the amplitude of the vibration, i. e. the lateral deflection. If we choose Dirichlet boundary conditions at both ends, i. e. $u(0) = 0 = u(L)$, we model a closed pipe; if we choose Dirichlet boundary conditions on one end, $u(0) = 0$, and Neumann boundary conditions on the other, $u'(L) = 0$, we model a half-closed pipe. Choosing domains is a question of physics!
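The way the boundary condition enters the symmetry of $-i\partial_x$ on $[0,1]$ can be checked numerically. The sketch below approximates $\langle \varphi, -i\partial_x\psi \rangle - \langle -i\partial_x\varphi, \psi \rangle$ by the trapezoidal rule and compares it with the boundary term $i(\varphi^*(0)\psi(0) - \varphi^*(1)\psi(1))$ of equation (5.3.2). The test functions, the value $\vartheta = 0.9$ and all function names are arbitrary choices made for this illustration; derivatives are supplied analytically.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 20001)

def inner(f, g):
    """Trapezoidal approximation of the L^2([0,1]) inner product, antilinear in f."""
    h = np.conj(f) * g
    return (x[1] - x[0]) * (np.sum(h) - 0.5 * (h[0] + h[-1]))

def symmetry_defect(phi, dphi, psi, dpsi):
    """<phi, -i psi'> - <-i phi', psi>; zero whenever the boundary terms cancel."""
    return inner(phi, -1j * dpsi) - inner(-1j * dphi, psi)

def boundary_term(phi, psi):
    """i (phi^*(0) psi(0) - phi^*(1) psi(1)) as in equation (5.3.2)."""
    return 1j * (np.conj(phi[0]) * psi[0] - np.conj(phi[-1]) * psi[-1])

# Generic elements of the maximal domain: phi = 1 and psi = x.
one = np.ones_like(x, dtype=complex)
zero = np.zeros_like(x, dtype=complex)
ramp = x.astype(complex)
defect_max = symmetry_defect(one, zero, ramp, one)

# Two functions satisfying phi(0) = exp(-i theta) phi(1) for the same theta.
theta = 0.9
phase = np.exp(1j * theta * x)
bump = 2.0 + np.cos(2.0 * np.pi * x)
phi, dphi = phase, 1j * theta * phase
psi = phase * bump
dpsi = phase * (1j * theta * bump - 2.0 * np.pi * np.sin(2.0 * np.pi * x))
defect_theta = symmetry_defect(phi, dphi, psi, dpsi)
```

For the generic pair the defect equals the boundary term $-i$, while it vanishes for the pair that shares the phase condition.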


5.4 Selfadjoint operators 


Although we do not have time to explore this very far, the crucial difference between $P_{\min}$ and $P_\vartheta$ is that the former is only symmetric while the latter is also selfadjoint. We first recall the definition of the adjoint of a possibly unbounded operator:

Definition 5.4.1 (Adjoint operator) Let $A$ be a densely defined linear operator on a Hilbert space $\mathcal{H}$ with domain $\mathcal{D}(A)$. Let $\mathcal{D}(A^*)$ be the set of $\varphi \in \mathcal{H}$ for which there exists $\phi \in \mathcal{H}$ with

$$\langle A\psi, \varphi \rangle = \langle \psi, \phi \rangle \qquad \forall \psi \in \mathcal{D}(A).$$

For each $\varphi \in \mathcal{D}(A^*)$, we define $A^*\varphi := \phi$ and $A^*$ is called the adjoint of $A$.

Remark 5.4.2 By Riesz’ Lemma, $\varphi$ belongs to $\mathcal{D}(A^*)$ if and only if

$$|\langle A\psi, \varphi \rangle| \leq C \|\psi\| \qquad \forall \psi \in \mathcal{D}(A).$$

This is equivalent to saying $\varphi \in \mathcal{D}(A^*)$ if and only if $\psi \mapsto \langle A\psi, \varphi \rangle$ is continuous on $\mathcal{D}(A)$. As a matter of fact, we could have used the latter to define the adjoint operator.

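One word of caution: even if $A$ is densely defined, $A^*$ need not be.

Example Let $f \in L^\infty(\mathbb{R})$, but $f \notin L^2(\mathbb{R})$, and pick $\psi_0 \in L^2(\mathbb{R})$, $\psi_0 \neq 0$. Define

$$\mathcal{D}(T_f) := \Big\{ \psi \in L^2(\mathbb{R}) \ \Big| \ \int_{\mathbb{R}} dx \, |f(x)\psi(x)| < \infty \Big\}.$$

Then the adjoint of the operator

$$T_f\psi := \langle f, \psi \rangle \, \psi_0, \qquad \psi \in \mathcal{D}(T_f),$$

has the domain $\mathcal{D}(T_f^*) = \{\psi_0\}^\perp$, which is not dense. Let $\psi \in \mathcal{D}(T_f)$. Then for any $\varphi \in \mathcal{D}(T_f^*)$

$$\langle T_f\psi, \varphi \rangle = \big\langle \langle f, \psi \rangle \psi_0, \varphi \big\rangle = \langle \psi, f \rangle \, \langle \psi_0, \varphi \rangle = \big\langle \psi, \langle \psi_0, \varphi \rangle f \big\rangle.$$

Hence $T_f^*\varphi = \langle \psi_0, \varphi \rangle f$. However $f \notin L^2(\mathbb{R})$ and thus $T_f^*\varphi$ is well defined only if $\langle \psi_0, \varphi \rangle = 0$; in that case $T_f^*\varphi = 0$.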
2013.10.31

Symmetric operators, however, are special: since $\langle H\varphi, \psi \rangle = \langle \varphi, H\psi \rangle$ holds by definition for all $\varphi, \psi \in \mathcal{D}(H)$, the domain of $H$ is contained in that of $H^*$, $\mathcal{D}(H^*) \supseteq \mathcal{D}(H)$. In particular, $\mathcal{D}(H^*) \subseteq \mathcal{H}$ is also dense. Thus, $H^*$ is an extension of $H$.

Definition 5.4.3 (Selfadjoint operator) Let $H$ be a symmetric operator on a Hilbert space $\mathcal{H}$ with domain $\mathcal{D}(H)$. $H$ is called selfadjoint, $H^* = H$, iff $\mathcal{D}(H^*) = \mathcal{D}(H)$.

One word regarding notation: if we write $H^* = H$, we do not just imply that the “operator prescription” of $H$ and $H^*$ is the same, but that the domains of both coincide.

Example In this sense, $P_{\min}^* \neq P_{\min}$, whereas $P_\vartheta^* = P_\vartheta$.

The central theorem of this section is Stone’s Theorem:

Theorem 5.4.4 (Stone) To every strongly continuous one-parameter unitary group $U$ on a Hilbert space $\mathcal{H}$, there exists a selfadjoint operator $H = H^*$ which generates $U(t) = e^{-itH}$. Conversely, every selfadjoint operator $H$ generates the unitary evolution group $U(t) = e^{-itH}$.

A complete proof [RS72, Chapter VIII.3] is beyond our capabilities.


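5.5 Recasting the Maxwell equations as a Schrödinger equation

The Maxwell equations in a medium with electric permittivity $\varepsilon$ and magnetic permeability $\mu$ are given by the two dynamical

$$\partial_t \mathbf{E}(t) = +\varepsilon^{-1} \nabla_x \times \mathbf{H}(t) \qquad (5.5.1a)$$
$$\partial_t \mathbf{H}(t) = -\mu^{-1} \nabla_x \times \mathbf{E}(t) \qquad (5.5.1b)$$

and the two kinetic Maxwell equations

$$\nabla_x \cdot \varepsilon\mathbf{E}(t) = \rho \qquad (5.5.2a)$$
$$\nabla_x \cdot \mu\mathbf{H}(t) = j. \qquad (5.5.2b)$$

Here, the source terms in the kinetic equations are the charge density $\rho$ and the current density $j$; in the absence of sources, $\rho = 0$ and $j = 0$, the Maxwell equations are homogeneous.

We can rewrite the dynamical equations (5.5.1) as a Schrödinger-type equation,

$$i \frac{d}{dt} \begin{pmatrix} \mathbf{E}(t) \\ \mathbf{H}(t) \end{pmatrix} = M(\varepsilon,\mu) \begin{pmatrix} \mathbf{E}(t) \\ \mathbf{H}(t) \end{pmatrix}, \qquad (5.5.3)$$

where the Maxwell operator

$$M(\varepsilon,\mu) := W \, \mathrm{Rot} := \begin{pmatrix} \varepsilon^{-1} & 0 \\ 0 & \mu^{-1} \end{pmatrix} \begin{pmatrix} 0 & +i\nabla_x^\times \\ -i\nabla_x^\times & 0 \end{pmatrix} \qquad (5.5.4)$$

takes the role of the Schrödinger operator $H = -\Delta_x + V$. It can be conveniently written as the product of the multiplication operator $W$ which contains the material weights $\varepsilon$ and $\mu$, and the free Maxwell operator $\mathrm{Rot}$. Here, $\nabla_x^\times$ is just a short-hand for the curl, $\nabla_x^\times \mathbf{E} := \nabla_x \times \mathbf{E}$. The solution can now be expressed just like in the Schrödinger case,

$$\begin{pmatrix} \mathbf{E}(t) \\ \mathbf{H}(t) \end{pmatrix} = e^{-itM(\varepsilon,\mu)} \begin{pmatrix} \mathbf{E}^{(0)} \\ \mathbf{H}^{(0)} \end{pmatrix},$$

where the initial conditions must satisfy the no sources condition (equations (5.5.2) for $\rho = 0$ and $j = 0$); one can show that this is enough to ensure that also the time-evolved fields $\mathbf{E}(t)$ and $\mathbf{H}(t)$ satisfy the no sources conditions for all times.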

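Physically, the condition that $\mathbf{E}$ and $\mathbf{H}$ be square-integrable stems from the requirement that the field energy

$$\mathcal{E}(\mathbf{E},\mathbf{H}) := \frac{1}{2} \int_{\mathbb{R}^3} dx \, \Big( \varepsilon(x) \, |\mathbf{E}(x)|^2 + \mu(x) \, |\mathbf{H}(x)|^2 \Big) \qquad (5.5.5)$$

be finite; moreover, the field energy is a conserved quantity,

$$\mathcal{E}\big(\mathbf{E}(t),\mathbf{H}(t)\big) = \mathcal{E}\big(\mathbf{E}^{(0)},\mathbf{H}^{(0)}\big).$$

It is not coincidental that the expression for $\mathcal{E}$ looks like the square of a weighted $L^2$-norm: if we assume that $\varepsilon, \mu \in L^\infty(\mathbb{R}^3)$ are bounded away from $0$ and $+\infty$, i. e. there exist $c, C > 0$ for which

$$0 < c \leq \varepsilon(x), \mu(x) \leq C < +\infty$$

holds almost everywhere in $x \in \mathbb{R}^3$, then $\varepsilon^{-1}$ and $\mu^{-1}$ are also bounded away from $0$ and $+\infty$ in the above sense. Hence, we deduce

$$\Psi = (\psi^E, \psi^H) \in \mathcal{H}(\varepsilon,\mu) := L^2_\varepsilon(\mathbb{R}^3,\mathbb{C}^3) \oplus L^2_\mu(\mathbb{R}^3,\mathbb{C}^3)$$

where $L^2_\varepsilon(\mathbb{R}^3,\mathbb{C}^3)$ and $L^2_\mu(\mathbb{R}^3,\mathbb{C}^3)$ are defined analogously to problem 22. By definition, $\Psi$ is an element of $\mathcal{H}(\varepsilon,\mu)$ if and only if the norm $\|\Psi\|_{\mathcal{H}(\varepsilon,\mu)}$ induced by the weighted scalar product

$$\langle \Psi, \Phi \rangle_{\mathcal{H}(\varepsilon,\mu)} := \int_{\mathbb{R}^3} dx \, \varepsilon(x) \, \overline{\psi^E(x)} \cdot \phi^E(x) + \int_{\mathbb{R}^3} dx \, \mu(x) \, \overline{\psi^H(x)} \cdot \phi^H(x) \qquad (5.5.6)$$

with $\Phi = (\phi^E, \phi^H)$ is finite. Adapting the arguments from problem 22, we conclude that $L^2(\mathbb{R}^3,\mathbb{C}^6)$ and $\mathcal{H}(\varepsilon,\mu)$ can be canonically identified as Banach spaces. Now the field energy can be expressed as

$$\mathcal{E}(\mathbf{E},\mathbf{H}) = \tfrac{1}{2} \|(\mathbf{E},\mathbf{H})\|^2_{\mathcal{H}(\varepsilon,\mu)} := \tfrac{1}{2} \big\langle (\mathbf{E},\mathbf{H}), (\mathbf{E},\mathbf{H}) \big\rangle_{\mathcal{H}(\varepsilon,\mu)},$$

and the conservation of field energy suggests that $e^{-itM(\varepsilon,\mu)}$ is unitary with respect to the weighted scalar product $\langle \cdot, \cdot \rangle_{\mathcal{H}(\varepsilon,\mu)}$.

Indeed, this is the case: the scalar product can alternatively be expressed in terms of $W^{-1}$ and the unweighted scalar product on $L^2(\mathbb{R}^3,\mathbb{C}^6)$,

$$\langle \Psi, \Phi \rangle_{\mathcal{H}(\varepsilon,\mu)} = \langle \Psi, W^{-1}\Phi \rangle_{L^2(\mathbb{R}^3,\mathbb{C}^6)} = \langle W^{-1}\Psi, \Phi \rangle_{L^2(\mathbb{R}^3,\mathbb{C}^6)}. \qquad (5.5.7)$$

The last equality holds true, because $W^{\pm 1}$ are multiplication operators with scalar real-valued functions in the electric and magnetic component, and thus

$$\langle \psi^E, \varepsilon \phi^E \rangle_{L^2(\mathbb{R}^3,\mathbb{C}^3)} = \langle \varepsilon \psi^E, \phi^E \rangle_{L^2(\mathbb{R}^3,\mathbb{C}^3)}$$

holds for all $\psi^E, \phi^E \in L^2(\mathbb{R}^3,\mathbb{C}^3)$, for instance. Under the assumption that the free Maxwell operator $\mathrm{Rot}$ is selfadjoint on $L^2(\mathbb{R}^3,\mathbb{C}^6)$, one can also show the selfadjointness of the Maxwell operator $M(\varepsilon,\mu) = W \, \mathrm{Rot}$ by using its product structure and equation (5.5.7):

$$\langle \Psi, M(\varepsilon,\mu)\Phi \rangle_{\mathcal{H}(\varepsilon,\mu)} = \langle \Psi, W^{-1} W \, \mathrm{Rot}\, \Phi \rangle_{L^2(\mathbb{R}^3,\mathbb{C}^6)} = \langle \Psi, \mathrm{Rot}\, \Phi \rangle_{L^2(\mathbb{R}^3,\mathbb{C}^6)}$$
$$= \langle \mathrm{Rot}\, \Psi, \Phi \rangle_{L^2(\mathbb{R}^3,\mathbb{C}^6)} = \langle W \, \mathrm{Rot}\, \Psi, W^{-1}\Phi \rangle_{L^2(\mathbb{R}^3,\mathbb{C}^6)}$$
$$= \langle M(\varepsilon,\mu)\Psi, W^{-1}\Phi \rangle_{L^2(\mathbb{R}^3,\mathbb{C}^6)} = \langle M(\varepsilon,\mu)\Psi, \Phi \rangle_{\mathcal{H}(\varepsilon,\mu)}.$$

These arguments imply that $e^{-itM(\varepsilon,\mu)}$ is unitary with respect to $\langle \cdot, \cdot \rangle_{\mathcal{H}(\varepsilon,\mu)}$, and thus, we deduce that the dynamics conserve energy,

$$\mathcal{E}\big(\mathbf{E}(t),\mathbf{H}(t)\big) = \tfrac{1}{2} \big\| e^{-itM(\varepsilon,\mu)} \big(\mathbf{E}^{(0)},\mathbf{H}^{(0)}\big) \big\|^2_{\mathcal{H}(\varepsilon,\mu)} = \tfrac{1}{2} \big\| \big(\mathbf{E}^{(0)},\mathbf{H}^{(0)}\big) \big\|^2_{\mathcal{H}(\varepsilon,\mu)} = \mathcal{E}\big(\mathbf{E}^{(0)},\mathbf{H}^{(0)}\big).$$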

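Moreover, the formulation of the Maxwell equations as a Schrödinger equation also allows us to prove that the dynamics $e^{-itM(\varepsilon,\mu)}$ map real-valued fields onto real-valued fields: define complex conjugation $(C\Psi)(x) := \overline{\Psi(x)}$ on $\mathcal{H}(\varepsilon,\mu)$ component-wise. Then the fact that $\varepsilon$ and $\mu$ are real-valued implies $C$ commutes with $W$,

$$(CW\Psi)(x) = \overline{\big(W(\psi^E,\psi^H)\big)(x)} = \big( \varepsilon^{-1}(x) \, \overline{\psi^E(x)}, \ \mu^{-1}(x) \, \overline{\psi^H(x)} \big) = \big( W(\overline{\psi^E}, \overline{\psi^H}) \big)(x) = (WC\Psi)(x).$$

In problem 23, we have shown that $C \, \mathrm{Rot}\, C = -\mathrm{Rot}$, and thus

$$C \, M(\varepsilon,\mu) \, C = C \, W \, \mathrm{Rot}\, C = W \, C \, \mathrm{Rot}\, C = -W \, \mathrm{Rot} = -M(\varepsilon,\mu)$$

holds just as in the case of the free Maxwell equations. This means the unitary evolution operator and complex conjugation commute,

$$C \, e^{-itM(\varepsilon,\mu)} \, C = e^{+it \, C M(\varepsilon,\mu) C} = e^{-itM(\varepsilon,\mu)},$$

as does the real part operator $\mathrm{Re} := \frac{1}{2} \big( \mathrm{id}_{\mathcal{H}(\varepsilon,\mu)} + C \big)$,

$$\big[ e^{-itM(\varepsilon,\mu)}, \mathrm{Re} \big] = 0.$$

Now if the initial state $\big(\mathbf{E}^{(0)},\mathbf{H}^{(0)}\big) = \mathrm{Re}\, \big(\mathbf{E}^{(0)},\mathbf{H}^{(0)}\big)$ is real-valued, then so is the time-evolved state,

$$\big(\mathbf{E}(t),\mathbf{H}(t)\big) = e^{-itM(\varepsilon,\mu)} \big(\mathbf{E}^{(0)},\mathbf{H}^{(0)}\big) = e^{-itM(\varepsilon,\mu)} \, \mathrm{Re}\, \big(\mathbf{E}^{(0)},\mathbf{H}^{(0)}\big)$$
$$= \mathrm{Re}\, e^{-itM(\varepsilon,\mu)} \big(\mathbf{E}^{(0)},\mathbf{H}^{(0)}\big) = \mathrm{Re}\, \big(\mathbf{E}(t),\mathbf{H}(t)\big).$$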

The reformulation of the Maxwell equations as a Schrödinger-type equation was first made rigorous by [BS87]; it allows one to adapt and apply many of the techniques first developed for the analysis of Schrödinger operators to Maxwell operators, e. g. under suitable conditions one can derive ray optics as a “semiclassical limit” of the above equations.


2013.11.05 
Chapter 6

The Fourier transform


The wave equation, the free Schrödinger equation and the heat equation all admit the same class of “fundamental solutions”, namely exponential functions $e^{\pm i \xi \cdot x}$. In some cases, the boundary conditions impose restrictions on the admissible values of $\xi$.

This is because these equations share a common symmetry, namely invariance under translations (we will be more precise in Chapters 6.1.5 and 6.2.4-6.2.5), and the Fourier transform converts a PDE into an ODE. Moreover, certain properties of the function are tied to certain properties of the Fourier transform, the most famous being that the regularity of $f$ is linked to the decay rate of $\mathcal{F}f$.


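6.1 The Fourier transform on $\mathbb{T}^n$


Let us consider the Fourier transform on the torus $\mathbb{T}^n := (\mathbb{S}^1)^n$ which will be identified with $[-\pi, +\pi]^n$. Moreover, we will identify functions on $\mathbb{T}^n$ with $2\pi\mathbb{Z}^n$-periodic functions on $\mathbb{R}^n$ whenever convenient. Now we proceed to define the central notion of this section:

Definition 6.1.1 (Fourier transform on $\mathbb{T}^n$) For all $f \in L^1(\mathbb{T}^n)$, we set

$$(\mathcal{F}f)(k) := \hat{f}(k) := \frac{1}{(2\pi)^n} \int_{\mathbb{T}^n} dx \, e^{-ik \cdot x} \, f(x)$$

for $k \in \mathbb{Z}^n$. The Fourier series is the formal sum

$$\sum_{k \in \mathbb{Z}^n} \hat{f}(k) \, e^{+ik \cdot x}. \qquad (6.1.1)$$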

If all we know is that $f$ is integrable, then the question of whether the Fourier series converges turns out to be surprisingly hard. In fact, Kolmogorov has shown that there exist integrable functions for which the Fourier series diverges for almost all $x \in \mathbb{T}^n$. However, if we have additional information on $f$, e. g. if $f \in C^r(\mathbb{T}^n)$ for $r \geq n + 1$, then we can show that (6.1.1) exists as an absolutely convergent sum.

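Example To compute the Fourier coefficients for $f(x) = x \in L^1([-\pi, +\pi])$, we need to distinguish the cases $k = 0$,

$$(\mathcal{F}x)(0) = \frac{1}{2\pi} \int_{-\pi}^{+\pi} dx \, x = \Big[ \frac{1}{4\pi} x^2 \Big]_{-\pi}^{+\pi} = 0,$$

and $k \neq 0$,

$$(\mathcal{F}x)(k) = \frac{1}{2\pi} \int_{-\pi}^{+\pi} dx \, e^{-ikx} \, x = \Big[ \frac{i}{2\pi k} \, x \, e^{-ikx} \Big]_{-\pi}^{+\pi} - \frac{i}{2\pi k} \int_{-\pi}^{+\pi} dx \, e^{-ikx} = \frac{i \, (-1)^k}{k}.$$

Thus, the Fourier coefficients decay like $1/|k|$ for large $|k|$,

$$(\mathcal{F}x)(k) = \begin{cases} 0 & k = 0 \\ \dfrac{i \, (-1)^k}{k} & k \in \mathbb{Z} \setminus \{0\} \end{cases}.$$

We will see later on that this is because $f(x) = x$ has a discontinuity at $x = +\pi$ (which is identified with the point $x = -\pi$).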
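The closed-form coefficients of $f(x) = x$ can be cross-checked by numerical quadrature. The following sketch evaluates the defining integral with the trapezoidal rule; the function name `fourier_coefficient` and the grid size are arbitrary choices.

```python
import numpy as np

def fourier_coefficient(f, k, n_points=200001):
    """(F f)(k) = (1 / 2 pi) * integral over [-pi, pi] of exp(-i k x) f(x) dx,
    approximated with the trapezoidal rule."""
    x = np.linspace(-np.pi, np.pi, n_points)
    integrand = np.exp(-1j * k * x) * f(x)
    dx = x[1] - x[0]
    integral = dx * (np.sum(integrand) - 0.5 * (integrand[0] + integrand[-1]))
    return integral / (2.0 * np.pi)

ks = list(range(-5, 6))
numerical = np.array([fourier_coefficient(lambda x: x, k) for k in ks])
exact = np.array([0.0 if k == 0 else 1j * (-1.0) ** k / k for k in ks])
```

The moduli $|(\mathcal{F}x)(k)| = 1/|k|$ are not summable over $k \in \mathbb{Z}$, which is consistent with the remark that $f$ has a jump when viewed as a periodic function.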

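Before we continue, it is useful to introduce multiindex notation: for any $\alpha \in \mathbb{N}_0^n$, we set

$$\partial_x^\alpha f := \partial_{x_1}^{\alpha_1} \cdots \partial_{x_n}^{\alpha_n} f$$

and similarly $x^\alpha := x_1^{\alpha_1} \cdots x_n^{\alpha_n}$. The integer $|\alpha| := \sum_{j=1}^n \alpha_j$ is the degree of $\alpha$.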


6.1.1 Fundamental properties 

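First, we will enumerate various fundamental properties of the Fourier transform on $\mathbb{T}^n$:

Proposition 6.1.2 (Fundamental properties of $\mathcal{F}$) Let $f \in L^1(\mathbb{T}^n)$.

(i) $\mathcal{F} : L^1(\mathbb{T}^n) \to \ell^\infty(\mathbb{Z}^n)$

(ii) $\mathcal{F}$ is linear.

(iii) $(\mathcal{F}\bar{f})(k) = \overline{(\mathcal{F}f)(-k)}$

(iv) $\big(\mathcal{F}f(-\,\cdot\,)\big)(k) = (\mathcal{F}f)(-k)$

(v) $\big(\mathcal{F}(T_y f)\big)(k) = e^{-ik \cdot y} \, (\mathcal{F}f)(k)$ where $(T_y f)(x) := f(x - y)$ for $y \in \mathbb{T}^n$

(vi) $(\mathcal{F}f)(k - j) = \big(\mathcal{F}(e^{+ij \cdot x} f)\big)(k)$, $j \in \mathbb{Z}^n$

(vii) For all $f \in C^r(\mathbb{T}^n)$, we have $\big(\mathcal{F}(\partial_x^\alpha f)\big)(k) = i^{|\alpha|} \, k^\alpha \, (\mathcal{F}f)(k)$ for all $|\alpha| \leq r$.

Proof (i) can be deduced from the estimate

$$\|\mathcal{F}f\|_{\ell^\infty(\mathbb{Z}^n)} = \sup_{k \in \mathbb{Z}^n} |(\mathcal{F}f)(k)| \leq \frac{1}{(2\pi)^n} \int_{\mathbb{T}^n} dx \, \big| e^{-ik \cdot x} f(x) \big| = (2\pi)^{-n} \, \|f\|_{L^1(\mathbb{T}^n)}.$$

(ii)-(vi) follow from direct computation.

For (vii), we note that continuous functions on $\mathbb{T}^n$ are also integrable, and thus we see that if $f \in C^r(\mathbb{T}^n)$, then also $\partial_x^\alpha f \in L^1(\mathbb{T}^n)$ holds for any $|\alpha| \leq r$. This means $\mathcal{F}(\partial_x^\alpha f)$ exists, and we obtain by partial integration

$$\big(\mathcal{F}(\partial_x^\alpha f)\big)(k) = \frac{1}{(2\pi)^n} \int_{\mathbb{T}^n} dx \, e^{-ik \cdot x} \, \partial_x^\alpha f(x) = \frac{(-1)^{|\alpha|}}{(2\pi)^n} \int_{\mathbb{T}^n} dx \, \big( \partial_x^\alpha e^{-ik \cdot x} \big) f(x) = i^{|\alpha|} \, k^\alpha \, (\mathcal{F}f)(k).$$

Note that the periodicity of $f$ and its derivatives implies the boundary terms vanish. This finishes the proof. □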

Example ($\mathcal{F}$ representation of heat equation) Let us consider the heat equation

$$\partial_t u = +\Delta_x u$$

on $[-\pi, +\pi]^n$. We will see that we can write

$$u(t,x) = \sum_{k \in \mathbb{Z}^n} \hat{u}(t,k) \, e^{+ik \cdot x} \qquad (6.1.2)$$

in terms of Fourier coefficients $\hat{u}(t,k)$. If we assume we can interchange taking derivatives and the sum, then this induces an equation involving the coefficients,

$$\mathcal{F}(\partial_t u) = \mathcal{F}(\Delta_x u) \quad \Longrightarrow \quad \partial_t \hat{u}(t,k) = -k^2 \, \hat{u}(t,k).$$

And if the sum (6.1.2) converges to an integrable function $u(t)$, then clearly

$$(\mathcal{F}u)(t,k) = \hat{u}(t,k)$$

holds.
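The coefficient equation is an ODE for each $k$ with solution $\hat{u}(t,k) = e^{-k^2 t} \, \hat{u}(0,k)$, and this can be tried out in one dimension with the discrete Fourier transform. The sketch below is an illustration under assumptions not made in the text: $u(0,\cdot)$ is a trigonometric polynomial sampled on a uniform grid of $[-\pi, \pi)$, so that the discrete transform captures all of its coefficients; `heat_step` is a hypothetical helper name.

```python
import numpy as np

def heat_step(u0, t):
    """Evolve samples of u(0, .) on a uniform grid of [-pi, pi) up to time t
    by multiplying each Fourier mode with exp(-k^2 t)."""
    n = u0.size
    k = np.fft.fftfreq(n, d=1.0 / n)          # integer wave numbers
    u_hat = np.fft.fft(u0)
    # The result is real for real u0 because the multiplier is even in k.
    return np.fft.ifft(np.exp(-k**2 * t) * u_hat).real

n = 64
x = -np.pi + 2.0 * np.pi * np.arange(n) / n
u0 = np.cos(3.0 * x) + 0.5 * np.sin(x)
t = 0.2
u_t = heat_step(u0, t)
expected = np.exp(-9.0 * t) * np.cos(3.0 * x) + 0.5 * np.exp(-t) * np.sin(x)
```

Each mode decays at its own rate $k^2$, so high-frequency components of the initial condition are damped fastest.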


We have indicated before that it is not at all clear in what sense the Fourier series (6.1.1) exists. The simplest type of convergence is absolute convergence, and here Dominated Convergence gives a sufficient condition under which we can interchange taking limits and summation. This helps to prove that the Fourier series yields a continuous or $C^s$ function if the Fourier coefficients decay fast enough.

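Lemma 6.1.3 (Dominated convergence for sums) Let $a^{(j)} \in \ell^1(\mathbb{Z}^n)$ be a sequence of absolutely summable sequences so that the pointwise limits $\lim_{j \to \infty} a^{(j)}(k) = a(k)$ exist. Moreover, assume there exists a non-negative sequence $b = \big(b(k)\big)_{k \in \mathbb{Z}^n} \in \ell^1(\mathbb{Z}^n)$ so that

$$|a^{(j)}(k)| \leq b(k)$$

holds for all $k \in \mathbb{Z}^n$ and $j \in \mathbb{N}$. Then summing over $k \in \mathbb{Z}^n$ and taking the limit $\lim_{j \to \infty}$ commute, i. e.

$$\lim_{j \to \infty} \sum_{k \in \mathbb{Z}^n} a^{(j)}(k) = \sum_{k \in \mathbb{Z}^n} \lim_{j \to \infty} a^{(j)}(k) = \sum_{k \in \mathbb{Z}^n} a(k),$$

and $a = \big(a(k)\big)_{k \in \mathbb{Z}^n} \in \ell^1(\mathbb{Z}^n)$.

Proof Let $\epsilon > 0$. Then we deduce from the triangle inequality and $|a^{(j)}(k)| \leq b(k)$, $|a(k)| \leq b(k)$ that

$$\Big| \sum_{k \in \mathbb{Z}^n} a^{(j)}(k) - \sum_{k \in \mathbb{Z}^n} a(k) \Big| \leq \sum_{|k| \leq N} |a^{(j)}(k) - a(k)| + \sum_{|k| > N} |a^{(j)}(k) - a(k)|$$
$$\leq \sum_{|k| \leq N} |a^{(j)}(k) - a(k)| + 2 \sum_{|k| > N} b(k).$$

If we choose $N \in \mathbb{N}_0$ large enough, we can estimate the second term independently of $j$ and make it less than $\epsilon/2$. The first term is a finite sum converging to 0, and hence, we can make it $< \epsilon/2$ if we choose $j \geq K$ large enough. Hence,

$$\Big| \sum_{k \in \mathbb{Z}^n} a^{(j)}(k) - \sum_{k \in \mathbb{Z}^n} a(k) \Big| < \epsilon/2 + \epsilon/2 = \epsilon$$

holds for $j \geq K$. Moreover, $a = \big(a(k)\big)_{k \in \mathbb{Z}^n}$ is absolutely summable since $|a(k)| \leq b(k)$ and $b = \big(b(k)\big)_{k \in \mathbb{Z}^n} \in \ell^1(\mathbb{Z}^n)$. □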
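Corollary 6.1.4 (Continuity, smoothness and decay properties) (i) Assume the Fourier coefficients $\mathcal{F}f \in \ell^1(\mathbb{Z}^n)$ of $f \in L^1(\mathbb{T}^n)$ are absolutely summable. Then

$$f(x) = \sum_{k \in \mathbb{Z}^n} (\mathcal{F}f)(k) \, e^{+ik \cdot x}$$

holds almost everywhere and $f$ has a continuous representative.

(ii) Assume the Fourier coefficients of $f \in L^1(\mathbb{T}^n)$ are such that $\big(|k|^s \hat{f}(k)\big)_{k \in \mathbb{Z}^n}$ is absolutely summable for some $s \in \mathbb{N}$. Then

$$\partial_x^\alpha f(x) = \sum_{k \in \mathbb{Z}^n} i^{|\alpha|} \, k^\alpha \, (\mathcal{F}f)(k) \, e^{+ik \cdot x}$$

holds almost everywhere for all $|\alpha| \leq s$ and $f$ has a $C^s(\mathbb{T}^n)$ representative. Moreover, the Fourier series of $\partial_x^\alpha f$, $|\alpha| \leq s$, exist as absolutely convergent sums.

Proof (i) This follows from Dominated Convergence, observing that

$$b(k) := \big| \hat{f}(k) \, e^{+ik \cdot x} \big| = |\hat{f}(k)|$$

is a summable sequence which dominates each term of the sum on the right-hand side of (6.1.1) independently of $x \in \mathbb{T}^n$.

(ii) For any multiindex $\alpha \in \mathbb{N}_0^n$ with $|\alpha| \leq s$, we estimate each term in the sum from above by $|\hat{f}(k)|$ times

$$\big| i^{|\alpha|} \, k^\alpha \, e^{+ik \cdot x} \big| = |k^\alpha| \leq C \, |k|^{|\alpha|} \leq C \, |k|^s.$$

By assumption, $\big(|k|^s \hat{f}(k)\big)_{k \in \mathbb{Z}^n}$ and thus also $\big(k^\alpha \hat{f}(k)\big)_{k \in \mathbb{Z}^n}$, $|\alpha| \leq s$, are elements of $\ell^1(\mathbb{Z}^n)$, and hence, we have found the sum which dominates the right-hand side of

$$\partial_x^\alpha f(x) = \sum_{k \in \mathbb{Z}^n} i^{|\alpha|} \, k^\alpha \, \hat{f}(k) \, e^{+ik \cdot x}$$

for all $x \in \mathbb{T}^n$. Thus, a Dominated Convergence argument implies we can interchange differentiation with summation and that the sum depends on $x$ in a continuous fashion. □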


In what follows, we will need a multiplication, the convolution, defined on $L^1(\mathbb{T}^n)$ and $\ell^1(\mathbb{Z}^n)$. The convolution $*$ is intrinsically linked to the Fourier transform: similar to the case of $\mathbb{R}^n$, we define

\[ (f * g)(x) := \int_{\mathbb{T}^n} dy \, f(x-y) \, g(y) , \]

where we have used the identification between periodic functions on $\mathbb{R}^n$ and functions on $\mathbb{T}^n$. Moreover, a straightforward modification to the arguments in problem 16 shows that

\[ \| f * g \|_{L^1(\mathbb{T}^n)} \le \| f \|_{L^1(\mathbb{T}^n)} \, \| g \|_{L^1(\mathbb{T}^n)} . \tag{6.1.3} \]

There is also a convolution on $\ell^1(\mathbb{Z}^n)$: for any two sequences $a, b \in \ell^1(\mathbb{Z}^n)$ we set

\[ (a * b)(k) := \sum_{j \in \mathbb{Z}^n} a(k-j) \, b(j) . \tag{6.1.4} \]

More careful arguments allow one to generalize the convolution as a map between different spaces, e. g. $* : L^1(\mathbb{T}^n) \times L^p(\mathbb{T}^n) \longrightarrow L^p(\mathbb{T}^n)$.

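The Fourier transform intertwines pointwise multiplication of functions or Fourier coefficients with the convolution, a fact that will be eminently useful in applications.

Proposition 6.1.5 (i) $f, g \in L^1(\mathbb{T}^n) \;\Longrightarrow\; \mathcal{F}(f * g) = (2\pi)^n \, \mathcal{F}f \, \mathcal{F}g$

(ii) $f, g \in L^1(\mathbb{T}^n)$, $\hat f, \hat g \in \ell^1(\mathbb{Z}^n) \;\Longrightarrow\; \sum_{k \in \mathbb{Z}^n} (\mathcal{F}f * \mathcal{F}g)(k) \, e^{+ik\cdot x} = f(x) \, g(x)$

Proof (i) The convolution of two $L^1(\mathbb{T}^n)$ functions is integrable, and thus, the Fourier transform of $f * g$ exists. A quick computation yields the claim:

\begin{align*}
(\mathcal{F}(f * g))(k) &= \frac{1}{(2\pi)^n} \int_{\mathbb{T}^n} dx \, e^{-ik\cdot x} \, (f * g)(x) \\
&= \frac{1}{(2\pi)^n} \int_{\mathbb{T}^n} dx \int_{\mathbb{T}^n} dy \, e^{-ik\cdot x} \, f(x-y) \, g(y) \\
&= \frac{1}{(2\pi)^n} \int_{\mathbb{T}^n} dx \int_{\mathbb{T}^n} dy \, e^{-ik\cdot(x-y)} \, f(x-y) \, e^{-ik\cdot y} \, g(y) \\
&= (2\pi)^n \, (\mathcal{F}f)(k) \, (\mathcal{F}g)(k)
\end{align*}

(ii) By assumption on the Fourier series of $f$ and $g$, the sequence $\hat f * \hat g$ is absolutely summable, and hence

\begin{align*}
\sum_{k \in \mathbb{Z}^n} (\hat f * \hat g)(k) \, e^{+ik\cdot x} &= \sum_{k,j \in \mathbb{Z}^n} \hat f(k-j) \, \hat g(j) \, e^{+i(k-j)\cdot x} \, e^{+ij\cdot x} \\
&= \Bigl( \sum_{k \in \mathbb{Z}^n} \hat f(k) \, e^{+ik\cdot x} \Bigr) \Bigl( \sum_{j \in \mathbb{Z}^n} \hat g(j) \, e^{+ij\cdot x} \Bigr)
\end{align*}

exists for all $x \in \mathbb{T}^n$. We will show later in Proposition 6.1.14 that for almost all $x \in \mathbb{T}^n$, the sum $\sum_{k \in \mathbb{Z}^n} \hat f(k) \, e^{+ik\cdot x}$ equals $f(x)$ and similarly for $g$. □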
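
For trigonometric polynomials, statement (ii) reduces to a finite computation, which the following sketch checks numerically in one dimension. The coefficient sequences `f_hat` and `g_hat` and the helper names are hypothetical; sequences are stored as dictionaries mapping k to the coefficient.

```python
import cmath

def convolve(a, b):
    """(a * b)(k) = sum_j a(k - j) b(j) for finitely supported sequences on Z,
    stored as dicts {k: coefficient}."""
    out = {}
    for i, ai in a.items():          # i plays the role of k - j
        for j, bj in b.items():
            out[i + j] = out.get(i + j, 0) + ai * bj
    return out

def evaluate(coeffs, x):
    """Evaluate the trigonometric polynomial sum_k coeffs[k] e^{+ikx}."""
    return sum(c * cmath.exp(1j * k * x) for k, c in coeffs.items())

# Hypothetical coefficient sequences of two trigonometric polynomials f and g.
f_hat = {-1: 0.5, 0: 1.0, 2: -0.25j}
g_hat = {-2: 1j, 1: 2.0}

fg_hat = convolve(f_hat, g_hat)

# Largest deviation between f(x) g(x) and the series built from f_hat * g_hat.
sample_points = [0.0, 0.3, 1.7, -2.4]
max_error = max(
    abs(evaluate(fg_hat, x) - evaluate(f_hat, x) * evaluate(g_hat, x))
    for x in sample_points
)
print(max_error)
```

Multiplying two such polynomials multiplies exponentials term by term, which is exactly the index bookkeeping `i + j` in `convolve`.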
One helpful fact in applications is that the $L^p(\mathbb{T}^n)$ spaces are nested (cf. Section 6.2.3):

Lemma 6.1.6 $L^q(\mathbb{T}^n) \subseteq L^p(\mathbb{T}^n)$ for $1 \le p \le q \le +\infty$

That means it suffices to define $\mathcal{F}$ on $L^1(\mathbb{T}^n)$; life on $\mathbb{R}^n$ is not so simple, though.

Proof We content ourselves with a sketch: the main idea is to split the integral of $f$ into a region where $|f| \le 1$ and $|f| > 1$, and then use the compactness of $\mathbb{T}^n$.


6.1.2 Approximating Fourier series by trigonometric polynomials

The idea of the Fourier series is to approximate $L^1(\mathbb{T}^n)$ functions by

Definition 6.1.7 (Trigonometric polynomials) A trigonometric polynomial on $\mathbb{T}^n$ is a function of the form

\[ P(x) = \sum_{k \in \mathbb{Z}^n} a(k) \, e^{+ik\cdot x} \]

where $\{a(k)\}_{k \in \mathbb{Z}^n}$ is a finitely supported sequence in $\mathbb{Z}^n$. The degree of $P$ is the largest number $\sum_{j=1}^n |k_j|$ so that $a(k) \neq 0$ where $k = (k_1, \ldots, k_n)$. We denote the set of trigonometric polynomials by $\mathrm{Pol}(\mathbb{T}^n)$.

Writing a function in terms of its Fourier series is initially just an ansatz, i. e. we do not know whether the formal sum $\sum_{k \in \mathbb{Z}^n} \hat f(k) \, e^{+ik\cdot x}$ converges in any meaningful way. In some sense, this is akin to approximating functions using the Taylor series: only analytic functions can be locally expressed in terms of a power series in $x - x_0$, but a smooth function need not have a convergent or useful Taylor series at a point.

The situation is similar for Fourier series: we need additional conditions on $f$ to ensure that its Fourier series converges in a strong sense (e. g. absolute convergence). However, suitable resummations do converge, and one particularly convenient way to approximate $f \in L^1(\mathbb{T}^n)$ by trigonometric polynomials is to convolve it with an


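Definition 6.1.8 (Approximate identity) An approximate identity or Dirac sequence is a family of non-negative functions $(\delta_\varepsilon)_{\varepsilon \in (0,\varepsilon_0)} \subset L^1(\mathbb{T}^n)$, $\varepsilon_0 > 0$, with the following two properties:

(i) $\|\delta_\varepsilon\|_{L^1(\mathbb{T}^n)} = 1$ holds for all $\varepsilon \in (0,\varepsilon_0)$.

(ii) For any $R > 0$ we have $\lim_{\varepsilon \to 0} \int_{|x| \le R} dx \, \delta_\varepsilon(x) = 1$.

Sometimes the assumption that the $\delta_\varepsilon$ are non-negative is dropped. Dirac sequences are also named approximate identities, because

Theorem 6.1.9 Let $(\delta_\varepsilon)_{\varepsilon \in (0,\varepsilon_0)}$ be an approximate identity. Then for all $f \in L^1(\mathbb{T}^n)$ we have

\[ \lim_{\varepsilon \to 0} \| \delta_\varepsilon * f - f \|_{L^1(\mathbb{T}^n)} = 0 . \]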

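The proof is somewhat tedious: one needs to localize $\delta_\varepsilon$ close to 0 and away from 0, approximate $f$ by a step function near $x = 0$ and use property (ii) of approximate identities away from $x = 0$. The interested reader may look it up in [Gra08, Theorem 1.2.19 (i)].

One standard example of an approximate identity in this context is the Fejér kernel which is constructed from the Dirichlet kernel. In one dimension, the Dirichlet kernel is

\[ d_N(x_1) := \frac{1}{2\pi} \sum_{|k| \le N} e^{+ik x_1} = \frac{1}{2\pi} \, \frac{\sin\bigl( (2N+1) \tfrac{x_1}{2} \bigr)}{\sin \tfrac{x_1}{2}} . \tag{6.1.5} \]

Now the one-dimensional Fejér kernel is the Cesàro mean of the $d_k$,

\[ f_N(x_1) := \frac{1}{N+1} \sum_{k=0}^{N} d_k(x_1) = \frac{1}{2\pi} \sum_{|k| \le N} \Bigl( 1 - \frac{|k|}{N+1} \Bigr) e^{+ik x_1} = \frac{1}{2\pi} \, \frac{1}{N+1} \left( \frac{\sin\bigl( (N+1) \tfrac{x_1}{2} \bigr)}{\sin \tfrac{x_1}{2}} \right)^2 . \tag{6.1.6} \]

The higher-dimensional Dirichlet and Fejér kernels are then defined as

\[ D_N(x) := \prod_{j=1}^n d_N(x_j) = \frac{1}{(2\pi)^n} \sum_{|k_1| \le N} \cdots \sum_{|k_n| \le N} e^{+ik\cdot x} \tag{6.1.7} \]

and

\begin{align*}
F_N(x) := \prod_{j=1}^n f_N(x_j) &= \frac{1}{(2\pi)^n} \sum_{|k_j| \le N} \Bigl( \prod_{j=1}^n \Bigl( 1 - \frac{|k_j|}{N+1} \Bigr) \Bigr) e^{+ik\cdot x} \tag{6.1.8} \\
&= \frac{1}{(2\pi)^n} \, \frac{1}{(N+1)^n} \prod_{j=1}^n \left( \frac{\sin\bigl( (N+1) \tfrac{x_j}{2} \bigr)}{\sin \tfrac{x_j}{2}} \right)^2 . \tag{6.1.9}
\end{align*}

One can see that

Lemma 6.1.10 $(F_N)_{N \in \mathbb{N}}$ is an approximate identity.

Proof Since $F_N$ is the product of one-dimensional Fejér kernels and all the integrals factor, it suffices to consider the one-dimensional case: we note that $f_N$ is non-negative. Then the fact that $\int_{-\pi}^{+\pi} dx \, e^{+ikx} = 0$ for $k \in \mathbb{Z} \setminus \{0\}$ and $\int_{-\pi}^{+\pi} dx \, e^{0} = 2\pi$ implies (i).

(ii) is equivalent to proving $\lim_{N \to \infty} \int_{R \le |x| \le \pi} dx \, f_N(x) = 0$. But this follows from writing $f_N$ in terms of sines. □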
Moreover, if we convolve $f \in L^1(\mathbb{T}^n)$ with $F_N$, then

\[ (F_N * f)(x) = \sum_{|k_j| \le N} \Bigl( \prod_{j=1}^n \Bigl( 1 - \frac{|k_j|}{N+1} \Bigr) \Bigr) \hat f(k) \, e^{+ik\cdot x} \]

is a trigonometric polynomial of degree at most $nN$, and thus

Proposition 6.1.11 $\mathrm{Pol}(\mathbb{T}^n)$ is dense in $L^p(\mathbb{T}^n)$ for any $1 \le p < \infty$.

Proof Since $F_N$ is an approximate identity, $\lim_{N \to \infty} \| F_N * f - f \|_{L^p(\mathbb{T}^n)} = 0$, i. e. the trigonometric polynomial $F_N * f$ approximates $f$ arbitrarily well in norm. □
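
The difference between plain partial sums (convolution with the Dirichlet kernel) and Fejér means can be seen numerically. The sketch below is an illustration for n = 1 only; the test function, a square wave f(x) = sign(x) on (-pi, pi) with Fourier series (4/pi) sum over odd m of sin(mx)/m, and all helper names are assumptions made for this example.

```python
import math

def fejer_closed_form(N, x):
    """f_N(x) = (2 pi (N+1))^{-1} (sin((N+1) x / 2) / sin(x / 2))^2, x not in 2 pi Z."""
    return (math.sin((N + 1) * x / 2) / math.sin(x / 2)) ** 2 / (2 * math.pi * (N + 1))

def fejer_sum(N, x):
    """f_N(x) = (2 pi)^{-1} sum_{|k| <= N} (1 - |k|/(N+1)) e^{ikx} (real-valued)."""
    return sum((1 - abs(k) / (N + 1)) * math.cos(k * x)
               for k in range(-N, N + 1)) / (2 * math.pi)

def partial_sum(N, x):
    """Partial Fourier sum of the square wave up to |k| <= N."""
    return 4 / math.pi * sum(math.sin(m * x) / m for m in range(1, N + 1, 2))

def fejer_mean(N, x):
    """The same series with the Fejer weights 1 - m/(N+1)."""
    return 4 / math.pi * sum((1 - m / (N + 1)) * math.sin(m * x) / m
                             for m in range(1, N + 1, 2))

N = 51
grid = [math.pi * (i + 0.5) / 2000 for i in range(2000)]   # points in (0, pi)

max_partial = max(partial_sum(N, x) for x in grid)   # overshoots sup|f| = 1
max_fejer = max(fejer_mean(N, x) for x in grid)      # bounded by sup|f| = 1
kernel_gap = max(abs(fejer_sum(N, x) - fejer_closed_form(N, x)) for x in grid[::100])

print(max_partial, max_fejer, kernel_gap)
```

Because the Fejér kernel is non-negative with integral 1, the Fejér mean of a function bounded by 1 is again bounded by 1, whereas the partial sums overshoot near the jump (the Gibbs phenomenon). The last quantity compares the sum and sine representations of the one-dimensional Fejér kernel.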


6.1.3 Decay properties of Fourier coefficients 

A fundamental question is to ask about the behavior of the Fourier coefficients for large $|k|$. This is important in many applications, because it may mean that certain bases are more efficient than others. The simplest of these criteria is the

Lemma 6.1.12 (Riemann-Lebesgue lemma) $f \in L^1(\mathbb{T}^n) \;\Longrightarrow\; \lim_{|k| \to \infty} \hat f(k) = 0$

Proof By Proposition 6.1.11, we can approximate any $f \in L^1(\mathbb{T}^n)$ arbitrarily well by a trigonometric polynomial $P$, i. e. for any $\varepsilon > 0$, there exists a $P \in \mathrm{Pol}(\mathbb{T}^n)$ so that $\| f - P \|_{L^1(\mathbb{T}^n)} < \varepsilon$. Since the Fourier coefficients of $P$ satisfy $\lim_{|k| \to \infty} \hat P(k) = 0$ (only finitely many are non-zero), this also implies that the Fourier coefficients of $f$ satisfy

\[ 0 \le \limsup_{|k| \to \infty} |\hat f(k)| \le \limsup_{|k| \to \infty} \bigl( |\hat f(k) - \hat P(k)| + |\hat P(k)| \bigr) \le \varepsilon + 0 = \varepsilon . \]

Given that $\varepsilon$ can be chosen arbitrarily small, the above in fact implies $\lim_{|k| \to \infty} \hat f(k) = 0$. □

Proposition 6.1.13 $\mathcal{F} : L^1(\mathbb{T}^n) \longrightarrow \ell^\infty(\mathbb{Z}^n)$ is injective, i. e.

\[ \mathcal{F}f = \mathcal{F}g \quad \Longleftrightarrow \quad f = g \in L^1(\mathbb{T}^n) . \]

Proof Given that $\mathcal{F}$ is linear, it suffices to show that $\mathcal{F}f = 0$ holds if and only if $f = 0$:

Assume $\mathcal{F}f = 0$. Then $F_N * f = 0$. Since $(F_N)_{N \in \mathbb{N}}$ is an approximate identity, $0 = F_N * f \to f$ as $N \to \infty$, i. e. $f = 0$.

In case $f = 0$, also all of the Fourier coefficients vanish, $\mathcal{F}f = 0$. □

Proposition 6.1.14 (Fourier inversion) Suppose $f \in L^1(\mathbb{T}^n)$ has an absolutely convergent Fourier series, i. e. $\mathcal{F}f \in \ell^1(\mathbb{Z}^n)$,

\[ \| \mathcal{F}f \|_{\ell^1(\mathbb{Z}^n)} = \sum_{k \in \mathbb{Z}^n} |\hat f(k)| < \infty . \]

Then

\[ f(x) = \sum_{k \in \mathbb{Z}^n} \hat f(k) \, e^{+ik\cdot x} \tag{6.1.10} \]

holds almost everywhere, and $f$ is almost-everywhere equal to a continuous function.

Proof Clearly, left- and right-hand side of equation (6.1.10) have the same Fourier coefficients, and thus, by Proposition 6.1.13 they are equal as elements of $L^1(\mathbb{T}^n)$. The continuity of the right-hand side follows from Corollary 6.1.4 (i). □


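Theorem 6.1.15 (Regularity of $f$ $\longleftrightarrow$ decay of $\mathcal{F}f$) (i) Let $s \in \mathbb{N}_0$, $\delta \in (0,1)$ and assume that the Fourier coefficients of $f \in L^1(\mathbb{T}^n)$ decay as

\[ |\hat f(k)| \le C \, (1 + |k|)^{-n-s-\delta} . \tag{6.1.11} \]

Then $f \in C^s(\mathbb{T}^n)$.

(ii) The Fourier coefficients of $f \in C^s(\mathbb{T}^n)$ satisfy $\lim_{|k| \to \infty} \bigl( |k|^r \hat f(k) \bigr) = 0$ for $r \le s$.

(iii) $f \in C^\infty(\mathbb{T}^n)$ holds if and only if for all $r \ge 0$ there exists $C_r > 0$ such that

\[ |\hat f(k)| \le C_r \, (1 + |k|)^{-r} . \]

Proof (i) The decay assumption (6.1.11) ensures that $\bigl( i^{|\alpha|} k^\alpha \hat f(k) \bigr)_{k \in \mathbb{Z}^n}$ is absolutely summable if $|\alpha| \le s$, and thus, by Proposition 6.1.2 (vii) and 6.1.14 left- and right-hand side of

\[ \partial_x^\alpha f(x) = \sum_{k \in \mathbb{Z}^n} i^{|\alpha|} \, k^\alpha \, \hat f(k) \, e^{+ik\cdot x} \]

are equal and continuous in $x$.

(ii) Clearly, $|k|^r \le |k|^s$ holds for $r \le s$ and $k \in \mathbb{Z}^n \setminus \{0\}$. Moreover, $\partial_x^\alpha f \in L^1(\mathbb{T}^n)$, $|\alpha| \le s$, and thus, the Riemann-Lebesgue lemma 6.1.12 implies $\lim_{|k| \to \infty} \bigl( |k|^r \hat f(k) \bigr) = 0$.

(iii) “$\Rightarrow$:” If $f \in C^\infty(\mathbb{T}^n)$, then (ii) implies that $\lim_{|k| \to \infty} \bigl( |k|^r \hat f(k) \bigr) = 0$ holds for all $r \ge 0$.

“$\Leftarrow$:” Conversely, if $\hat f(k)$ decays faster than any inverse power of $|k|$, then (i) implies for each $s \ge 0$ the function $f$ is an element of $C^s(\mathbb{T}^n)$, and thus $f \in C^\infty(\mathbb{T}^n)$. □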
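
The link between regularity and decay can be observed on two simple one-dimensional examples. The following sketch approximates Fourier coefficients by a midpoint rule; the two test functions, the quadrature and the number of nodes `M` are assumptions made for the illustration. For odd k, direct integration gives |f^(k)| = 2/(pi k) for the square wave and |f^(k)| = 2/(pi k^2) for the triangle wave |x|.

```python
import cmath
import math

def fourier_coefficient(f, k, M=20000):
    """Midpoint-rule approximation of (2 pi)^{-1} int_{-pi}^{pi} dx e^{-ikx} f(x)."""
    h = 2 * math.pi / M
    total = 0j
    for m in range(M):
        x = -math.pi + (m + 0.5) * h
        total += cmath.exp(-1j * k * x) * f(x)
    return total * h / (2 * math.pi)

def square(x):
    """Square wave: jump discontinuities, coefficients decay like 1/|k|."""
    return 1.0 if x > 0 else -1.0

def triangle(x):
    """Triangle wave: continuous with kinks, coefficients decay like 1/|k|^2."""
    return abs(x)

for k in (1, 3, 9, 27):
    print(k, abs(fourier_coefficient(square, k)), abs(fourier_coefficient(triangle, k)))
```

Tripling k reduces the square-wave coefficient by a factor of about 3 and the triangle-wave coefficient by a factor of about 9, in line with one extra order of regularity buying one extra power of decay.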
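6.1.4 The Fourier transform on $L^2(\mathbb{T}^n)$

We will dedicate a little more time to the example of the free Schrödinger equation on $\mathbb{T}^n \cong [-\pi,+\pi]^n$, where we equip $\Delta_x := \partial_{x_1}^2 + \ldots + \partial_{x_n}^2$ with periodic boundary conditions. If we can show that $\{e^{+ik\cdot x}\}_{k \in \mathbb{Z}^n}$ is an orthonormal basis, then any $\psi \in L^2(\mathbb{T}^n)$ has a unique Fourier expansion

\[ \psi(x) = \sum_{k \in \mathbb{Z}^n} \hat\psi(k) \, e^{+ik\cdot x} \tag{6.1.12} \]

where the sum converges in the $L^2$-sense and

\[ \hat\psi(k) = \bigl\langle e^{+ik\cdot x} , \psi \bigr\rangle_{L^2(\mathbb{T}^n)} = \frac{1}{(2\pi)^n} \int_{\mathbb{T}^n} dx \, e^{-ik\cdot x} \, \psi(x) \]

is the $k$th Fourier coefficient. Note that we have normalized the scalar product

\[ \langle f , g \rangle_{L^2(\mathbb{T}^n)} := \frac{1}{(2\pi)^n} \int_{\mathbb{T}^n} dx \, \overline{f(x)} \, g(x) \]

so that the $e^{+ik\cdot x}$ have $L^2$-norm 1. Lemma 6.1.6 tells us that $\hat\psi = \mathcal{F}\psi$ is well-defined, because $\psi \in L^2(\mathbb{T}^n) \subseteq L^1(\mathbb{T}^n)$. Hence, if $\{e^{+ik\cdot x}\}_{k \in \mathbb{Z}^n}$ is a basis, any $L^2(\mathbb{T}^n)$ function can be expressed as a Fourier series.

Lemma 6.1.16 $\{e^{+ik\cdot x}\}_{k \in \mathbb{Z}^n}$ is an orthonormal basis of $L^2(\mathbb{T}^n)$.

Proof The orthonormality of $\{e^{+ik\cdot x}\}_{k \in \mathbb{Z}^n}$ follows from a straight-forward calculation analogous to the one-dimensional case. The injectivity of $\mathcal{F} : L^2(\mathbb{T}^n) \subseteq L^1(\mathbb{T}^n) \longrightarrow \ell^\infty(\mathbb{Z}^n)$ (Proposition 6.1.13) means $\psi = 0 \in L^2(\mathbb{T}^n)$ if and only if $\mathcal{F}\psi = 0$. Hence, $\{e^{+ik\cdot x}\}_{k \in \mathbb{Z}^n}$ is a basis. □

Proposition 6.1.17 Let $f, g \in L^2(\mathbb{T}^n)$. Then the following holds true:

(i) Parseval’s identity: $\langle f , g \rangle_{L^2(\mathbb{T}^n)} = \langle \mathcal{F}f , \mathcal{F}g \rangle_{\ell^2(\mathbb{Z}^n)}$

(ii) Plancherel’s identity: $\| f \|_{L^2(\mathbb{T}^n)} = \| \mathcal{F}f \|_{\ell^2(\mathbb{Z}^n)}$

(iii) $\mathcal{F} : L^2(\mathbb{T}^n) \longrightarrow \ell^2(\mathbb{Z}^n)$ is a unitary.

Proof (i) follows from the fact that $\{e^{+ik\cdot x}\}_{k \in \mathbb{Z}^n}$ is an orthonormal basis and that the coefficients of the basis expansion

\[ \bigl\langle e^{+ik\cdot x} , \psi \bigr\rangle_{L^2(\mathbb{T}^n)} = (\mathcal{F}\psi)(k) \]

coincide with the Fourier coefficients.

(ii) and (iii) are immediate consequences of (i). □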
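
Plancherel's identity can be checked on a concrete example. The sketch below assumes n = 1 and the test function f(x) = x on (-pi, pi), which is a choice made for this illustration; integrating by parts gives f^(k) = i (-1)^k / k for k != 0 and f^(0) = 0, and with the normalized scalar product one has ||f||^2 = pi^2/3.

```python
import cmath
import math

def coefficient(k):
    """Closed form of the k-th Fourier coefficient of f(x) = x on (-pi, pi), k != 0."""
    return 1j * (-1) ** k / k

def coefficient_numeric(k, M=20000):
    """Midpoint rule for (2 pi)^{-1} int_{-pi}^{pi} dx e^{-ikx} x, as a cross-check."""
    h = 2 * math.pi / M
    nodes = (-math.pi + (m + 0.5) * h for m in range(M))
    return sum(cmath.exp(-1j * k * x) * x for x in nodes) * h / (2 * math.pi)

# Left-hand side: ||f||^2 = (2 pi)^{-1} int_{-pi}^{pi} x^2 dx = pi^2 / 3.
norm_f_squared = math.pi ** 2 / 3

# Right-hand side: sum over 0 < |k| <= K of |f^(k)|^2; the terms for k and -k agree.
K = 100000
norm_coefficients_squared = sum(2 * abs(coefficient(k)) ** 2 for k in range(1, K + 1))

print(norm_f_squared, norm_coefficients_squared)
```

The truncated sum stays slightly below pi^2/3; the missing tail is of order 2/K.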
6.1.5 Periodic operators

An important class of operators are those which commute with translations,

\[ [H , T_y] = H \, T_y - T_y \, H = 0 , \qquad \forall y \in \mathbb{T}^n , \]

because this symmetry implies the exponential functions are eigenfunctions of $H$.

Theorem 6.1.18 Suppose that $H : L^p(\mathbb{T}^n) \longrightarrow L^q(\mathbb{T}^n)$, $1 \le p, q \le \infty$, is a bounded linear operator which commutes with translations. Then there exists $\{h(k)\}_{k \in \mathbb{Z}^n} \in \ell^\infty(\mathbb{Z}^n)$ so that

\[ (Hf)(x) = \sum_{k \in \mathbb{Z}^n} h(k) \, \hat f(k) \, e^{+ik\cdot x} \tag{6.1.13} \]

holds for all $f \in C^\infty(\mathbb{T}^n)$. Moreover, we have $\| h \|_{\ell^\infty(\mathbb{Z}^n)} \le \| H \|_{\mathcal{B}(L^p(\mathbb{T}^n),L^q(\mathbb{T}^n))}$.

An important class of examples are differential operators,

\[ H := \sum_{\substack{\alpha \in \mathbb{N}_0^n \\ |\alpha| \le N}} \beta(\alpha) \, \partial_x^\alpha , \qquad \beta(\alpha) \in \mathbb{C} , \]

whose eigenvalues are

\[ H e^{+ik\cdot x} = \Bigl( \sum_{|\alpha| \le N} \beta(\alpha) \, i^{|\alpha|} \, k^\alpha \Bigr) e^{+ik\cdot x} =: h(k) \, e^{+ik\cdot x} . \]

The most famous example so far was $-\Delta_x$ whose eigenvalues are $k^2$.

Proof We already know that the Fourier series of $f \in C^\infty(\mathbb{T}^n)$ converges absolutely, and its Fourier coefficients decay faster than any power of $|k|$ (Theorem 6.1.15). So consider the functions $\varphi_k(x) := e^{+ik\cdot x}$, $k \in \mathbb{Z}^n$. The exponential functions are eigenfunctions of the translation operator,

\[ (T_y \varphi_k)(x) = \varphi_k(x - y) = e^{-ik\cdot y} \, \varphi_k(x) , \]

and thus, the fact that $H$ commutes with translations implies

\[ (T_y H \varphi_k)(x) = (H\varphi_k)(x - y) = \bigl( H (T_y \varphi_k) \bigr)(x) = e^{-ik\cdot y} \, (H\varphi_k)(x) . \]

Now writing $x = y - (y - x)$ and interchanging the roles of $x$ and $y$ yields that $\varphi_k(x) = e^{+ik\cdot x}$ is an eigenfunction of $H$,

\[ (H\varphi_k)(x) = (H\varphi_k)(x - y + y) = e^{+ik\cdot(x-y)} \, (H\varphi_k)(y) =: h(k) \, \varphi_k(x) . \]

It is easy to see that $h(k)$ is in fact independent of the choice of $y \in \mathbb{T}^n$. The above also means $|h(k)| \le \| H \|_{\mathcal{B}(L^p(\mathbb{T}^n),L^q(\mathbb{T}^n))}$ holds for all $k \in \mathbb{Z}^n$, and taking the supremum over $k$ yields $\| h \|_{\ell^\infty(\mathbb{Z}^n)} \le \| H \|_{\mathcal{B}(L^p(\mathbb{T}^n),L^q(\mathbb{T}^n))}$.

Hence, equation (6.1.13) holds for all $f \in C^\infty(\mathbb{T}^n)$, e. g. for all trigonometric polynomials. Since those are dense in $L^p(\mathbb{T}^n)$ and $H$ restricted to $C^\infty(\mathbb{T}^n)$ is bounded, there exists a unique bounded extension on all of $L^p(\mathbb{T}^n)$ (Theorem 5.1.6). □
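
The eigenvalue relation for differential operators can be checked with finite differences. The sketch below is an illustration in one dimension; the operator H = -d^2/dx^2 + beta d/dx, the value of `beta`, the step size and all function names are hypothetical choices, and the derivatives are only approximated by central differences.

```python
import cmath

def H(f, x, beta=0.5, step=1e-4):
    """Central-difference approximation of (H f)(x) = -f''(x) + beta f'(x).
    The operator and the value of beta are hypothetical choices."""
    first = (f(x + step) - f(x - step)) / (2 * step)
    second = (f(x + step) - 2 * f(x) + f(x - step)) / step ** 2
    return -second + beta * first

def symbol(k, beta=0.5):
    """h(k) = sum_alpha beta(alpha) i^{|alpha|} k^alpha = k^2 + i beta k for this H."""
    return k ** 2 + 1j * beta * k

def phi(k):
    """The exponential function x -> e^{+ikx}."""
    return lambda x: cmath.exp(1j * k * x)

x0 = 0.7
errors = [abs(H(phi(k), x0) - symbol(k) * phi(k)(x0)) for k in (-3, 0, 1, 4)]
print(errors)
```

Up to discretization error, applying H to an exponential returns the same exponential multiplied by h(k), which is the content of (6.1.13) for a single Fourier mode.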


6.1.6 Solving quantum problems using $\mathcal{F}$

The Fourier transform helps to simplify solving quantum problems. The idea is to convert the Hamiltonian to a multiplication operator. We start with a very simple example:

6.1.6.1 The free Schrödinger equation

Let us return to the example of the Schrödinger equation: if we denote the free Schrödinger operator in the position representation with $H := -\Delta_x$ acting on $L^2(\mathbb{T}^n)$ and impose periodic boundary conditions (cf. the discussion about the shift operator on the interval in Chapter 5.3), then the Fourier transform $\mathcal{F} : L^2(\mathbb{T}^n) \longrightarrow \ell^2(\mathbb{Z}^n)$ connects position and momentum representation, i. e.

\[ H^{\mathcal{F}} := \mathcal{F} \, H \, \mathcal{F}^{-1} = \hat k^2 \tag{6.1.14} \]

acts on suitable vectors from $\ell^2(\mathbb{Z}^n)$ by multiplication with $k^2$. The names position and momentum representation originate from physics, because here, the variable $x \in \mathbb{T}^n$ is interpreted as a position while $k \in \mathbb{Z}^n$ is a momentum.

To arrive at (6.1.14), we use that any $\psi \in L^2(\mathbb{T}^n)$ has a Fourier expansion since the orthonormal set $\{e^{+ik\cdot x}\}_{k \in \mathbb{Z}^n}$ is also a basis and that the $k$th Fourier coefficient of

\[ -\Delta_x \psi(x) = -\Delta_x \sum_{k \in \mathbb{Z}^n} \hat\psi(k) \, e^{+ik\cdot x} = \sum_{k \in \mathbb{Z}^n} k^2 \, \hat\psi(k) \, e^{+ik\cdot x} \]

is just $k^2 \, \hat\psi(k)$, provided that the sequence of coefficients on the right-hand side is square summable. The latter condition just means that $-\Delta_x \psi$ must exist in $L^2(\mathbb{T}^n)$.

Clearly, the solution to the free Schrödinger equation in momentum representation is the multiplication operator

\[ U^{\mathcal{F}}(t) = e^{-it \hat k^2} , \]

and thus we obtain the solution in position representation as well,

\[ U(t) = \mathcal{F}^{-1} \, U^{\mathcal{F}}(t) \, \mathcal{F} . \]

Applied to a vector, this yields

\[ \psi(t) = \sum_{k \in \mathbb{Z}^n} e^{-it k^2} \, \hat\psi(k) \, e^{+ik\cdot x} . \]

Note that this sum exists in $L^2(\mathbb{T}^n)$ if and only if the initial state $\psi$ is square-integrable.
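
As a minimal sketch of this recipe, assuming n = 1, H = -Laplacian and a hypothetical initial state with only finitely many non-zero Fourier coefficients, the time evolution amounts to multiplying each coefficient by a phase. The dictionary representation and the helper names are choices made for this illustration.

```python
import cmath
import math

def evolve(psi_hat, t):
    """Multiply each Fourier coefficient by e^{-i t k^2} (momentum representation)."""
    return {k: cmath.exp(-1j * t * k ** 2) * c for k, c in psi_hat.items()}

def evaluate(psi_hat, x):
    """Position representation: psi(x) = sum_k psi_hat(k) e^{+ikx}."""
    return sum(c * cmath.exp(1j * k * x) for k, c in psi_hat.items())

def norm_squared(psi_hat):
    """l^2 norm squared of the coefficients, equal to the L^2 norm squared of psi."""
    return sum(abs(c) ** 2 for c in psi_hat.values())

# Hypothetical initial state, given by its Fourier coefficients.
psi0_hat = {-2: 0.3, 0: 0.5j, 1: 0.6, 3: -0.2}

# The evolution only changes phases, so the norm is conserved.
psi_t_hat = evolve(psi0_hat, t=0.37)
norm_drift = abs(norm_squared(psi_t_hat) - norm_squared(psi0_hat))

# Since k^2 is an integer, e^{-i 2 pi k^2} = 1: the state recurs at t = 2 pi.
revival_gap = max(
    abs(evaluate(evolve(psi0_hat, 2 * math.pi), x) - evaluate(psi0_hat, x))
    for x in (0.0, 1.0, 2.5)
)
print(norm_drift, revival_gap)
```

The exact recurrence at t = 2 pi is a feature of the torus: all eigenvalues k^2 are integers, so every phase returns to 1 simultaneously.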

6.1.6.2 Tight-binding models in solid state physics 

Quantum mechanical models are encoded via choosing a hamiltonian and a Hilbert space
on which it acts. In the previous section, we have started in a situation where the position
variable took values in $\mathbb{T}^n$, i. e. wave functions were elements of $L^2(\mathbb{T}^n)$. Tight-binding
models, for instance, are but one example where the position variable is a lattice vector
and the wave function $\psi \in \ell^2(\mathbb{Z}^n)$ a square summable sequence.

Tight-binding models describe conduction in semiconductors and insulators: here, the 
electron may jump from its current position to neighboring atoms with a certain ampli¬ 
tude. One usually restricts oneself to the case where only the hopping amplitudes to 
nearest and sometimes next-nearest neighbors are included: in many cases, one can prove 
that these hopping amplitudes decay exponentially with the distance, and hence, one only 
needs to include the leading-order terms. 

Single-band model Let us consider a two-dimensional lattice. First of all, we note that 
the number of nearest neighbors actually depends on the crystal structure. For a simple 
square lattice, the number of nearest neighbors is 4 while for a hexagonal lattice, there 
are only 3. Let us start with the square lattice: then the hamiltonian 

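\[ H = \mathrm{id}_{\ell^2(\mathbb{Z}^2)} + q_1 \, S_1 + q_2 \, S_2 + \bar q_1 \, S_1^* + \bar q_2 \, S_2^* \tag{6.1.15} \]

which includes only nearest-neighbor hopping with amplitudes $q_1, q_2 \in \mathbb{C}$ is defined in terms of the shift operators

\[ (S_j \psi)(\gamma) := \psi(\gamma - e_j) , \qquad \psi \in \ell^2(\mathbb{Z}^2) . \]

Here, $e_j$ stands for either $e_1 = (1,0)$ or $e_2 = (0,1)$ and $\gamma \in \mathbb{Z}^2$. If one sees $\psi(\gamma)$ as the Fourier coefficients for

\[ (\mathcal{F}^{-1}\psi)(k) = \sum_{\gamma \in \mathbb{Z}^2} \psi(\gamma) \, e^{+i\gamma\cdot k} \]

then one can see that the shift operator in momentum representation is just the multiplication operator $S_j^{\mathcal{F}} := \mathcal{F}^{-1} S_j \mathcal{F} = e^{+ik_j}$:

\begin{align*}
(\mathcal{F}^{-1} S_j \psi)(k) &= \sum_{\gamma \in \mathbb{Z}^2} \psi(\gamma - e_j) \, e^{+i\gamma\cdot k} = \sum_{\gamma \in \mathbb{Z}^2} \psi(\gamma) \, e^{+i(\gamma + e_j)\cdot k} \\
&= e^{+ik_j} \, (\mathcal{F}^{-1}\psi)(k)
\end{align*}

Note that $S_j^{\mathcal{F}} = \mathcal{F}^{-1} S_j \mathcal{F}$ makes sense as a composition of bounded linear operators and that $S_j^{\mathcal{F}} : L^2(\mathbb{T}^2) \longrightarrow L^2(\mathbb{T}^2)$ is again unitary.

Hence, the Hamilton operator (6.1.15) in momentum representation transforms to

\begin{align*}
H^{\mathcal{F}} &= 1 + q_1 \, e^{+ik_1} + q_2 \, e^{+ik_2} + \bar q_1 \, e^{-ik_1} + \bar q_2 \, e^{-ik_2} \\
&= 1 + 2 \mathrm{Re} \bigl( q_1 \, e^{+ik_1} \bigr) + 2 \mathrm{Re} \bigl( q_2 \, e^{+ik_2} \bigr) .
\end{align*}

It turns out that in the absence of magnetic fields, the hopping amplitudes can be chosen to be real, and then $H^{\mathcal{F}}$ becomes the multiplication operator associated to the band function

\[ E(k) = 1 + 2 q_1 \cos k_1 + 2 q_2 \cos k_2 . \]

In other words, the Fourier transform converts an operator of shifts into a multiplication operator. That means we can solve the Schrödinger equation in momentum representation,

\[ i \partial_t \hat\psi(t) = H^{\mathcal{F}} \hat\psi(t) , \qquad \hat\psi(0) = \hat\psi_0 \in L^2(\mathbb{T}^2) , \]

because also the unitary evolution group is just a multiplication operator,

\[ U^{\mathcal{F}}(t) = e^{-it E(\hat k)} . \]

Moreover, the unitary evolution group associated to the Schrödinger equation in position representation

\[ i \partial_t \psi(t) = H \psi(t) , \qquad \psi(0) = \psi_0 \in \ell^2(\mathbb{Z}^2) , \]

is obtained by changing back to the position representation with $\mathcal{F}$,

\[ U(t) = \mathcal{F} \, U^{\mathcal{F}}(t) \, \mathcal{F}^{-1} . \]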
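
The passage from shifts to a multiplication operator can be verified directly for finitely supported sequences. The sketch below is only an illustration: the amplitudes `q1`, `q2`, the sequence `psi` and all helper names are hypothetical, and the momentum-space function is taken to be the sum over lattice sites of psi(gamma) e^{+i gamma.k}, as in the computation above.

```python
import cmath
import math

def apply_H(psi, q1, q2):
    """Apply H = id + q1 S_1 + q2 S_2 + conj(q1) S_1^* + conj(q2) S_2^* to a finitely
    supported sequence on Z^2 stored as {(g1, g2): value}, with
    (S_j psi)(g) = psi(g - e_j)."""
    out = {}

    def add(site, value):
        out[site] = out.get(site, 0) + value

    for (g1, g2), v in psi.items():
        add((g1, g2), v)
        add((g1 + 1, g2), q1 * v)               # S_1 moves the weight at g to g + e_1
        add((g1, g2 + 1), q2 * v)               # S_2 moves the weight at g to g + e_2
        add((g1 - 1, g2), q1.conjugate() * v)   # the adjoints shift the other way
        add((g1, g2 - 1), q2.conjugate() * v)
    return out

def to_momentum(psi, k1, k2):
    """sum_g psi(g) e^{+i g.k} for a finitely supported sequence psi."""
    return sum(v * cmath.exp(1j * (g1 * k1 + g2 * k2)) for (g1, g2), v in psi.items())

def band(k1, k2, q1, q2):
    """E(k) = 1 + 2 q1 cos k1 + 2 q2 cos k2 for real hopping amplitudes."""
    return 1 + 2 * q1 * math.cos(k1) + 2 * q2 * math.cos(k2)

q1, q2 = 0.8, -0.3                                   # hypothetical real amplitudes
psi = {(0, 0): 1.0, (1, 0): 0.5j, (-2, 3): -0.25}    # hypothetical sequence

gap = max(
    abs(to_momentum(apply_H(psi, q1, q2), k1, k2)
        - band(k1, k2, q1, q2) * to_momentum(psi, k1, k2))
    for k1, k2 in [(0.0, 0.0), (0.4, -1.1), (2.0, 3.0)]
)
print(gap)
```

Applying H on the lattice and then passing to momentum space gives the same result as multiplying the momentum-space function by E(k), up to rounding.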





Figure 6.1.1: The honeycomb lattice is the superposition of two triangular lattices where 
the fundamental cell contains two atoms, one black and one white. 


Two-band model The situation is more interesting and more complicated for hexagonal 
lattices: here, there are three nearest neighbors and two atoms per unit cell. The following 
operators have been studied as simplified operators for graphene and boron-nitride (see 
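e. g. [HKN+06; DL13]). Here, the relevant Hilbert space is $\ell^2(\mathbb{Z}^2,\mathbb{C}^2)$ where the “internal”
degree of freedom corresponds to the two atoms in the unit cell (black and white atoms
in Figure 6.1.1). Here, nearest neighbors are atoms of a “different color”, and the relevant
operator takes the form

\[ H = \begin{pmatrix} 0 & 1_{\ell^2(\mathbb{Z}^2)} + q_1 \, S_1 + q_2 \, S_2 \\ 1_{\ell^2(\mathbb{Z}^2)} + \bar q_1 \, S_1^* + \bar q_2 \, S_2^* & 0 \end{pmatrix} \]

and acts on $\psi = (\psi^{(0)}, \psi^{(1)}) \in \ell^2(\mathbb{Z}^2,\mathbb{C}^2)$ as

\[ (H\psi)(\gamma) = \begin{pmatrix} \psi^{(1)}(\gamma) + q_1 \, \psi^{(1)}(\gamma - e_1) + q_2 \, \psi^{(1)}(\gamma - e_2) \\ \psi^{(0)}(\gamma) + \bar q_1 \, \psi^{(0)}(\gamma + e_1) + \bar q_2 \, \psi^{(0)}(\gamma + e_2) \end{pmatrix} . \]

One can again use the Fourier transform to relate $H$ to a matrix-valued multiplication operator on $L^2(\mathbb{T}^2,\mathbb{C}^2)$, namely

\[ H^{\mathcal{F}} = \begin{pmatrix} 0 & \overline{\varpi(k)} \\ \varpi(k) & 0 \end{pmatrix} =: T(k) , \]

where we have defined

\[ \varpi(k) = 1 + \bar q_1 \, e^{-ik_1} + \bar q_2 \, e^{-ik_2} \]

and one can conveniently write $T$ in terms of the Pauli matrices as

\[ T(k) = \mathrm{Re} \bigl( \varpi(k) \bigr) \, \sigma_1 + \mathrm{Im} \bigl( \varpi(k) \bigr) \, \sigma_2 . \]

The advantage is that there exist closed formulas for the eigenvalues

\[ E_\pm(k) = \pm |\varpi(k)| \]

of $T(k)$ which are interpreted as the upper and lower band functions and the two eigenprojections

\[ P_\pm(k) = \frac{1}{2} \left( \mathrm{id}_{\mathbb{C}^2} \pm \frac{T(k)}{|\varpi(k)|} \right) \qquad (\varpi(k) \neq 0) \]

associated to $E_\pm(k)$. Now if one wants to solve the associated Schrödinger equation in momentum representation,

\[ i \partial_t \hat\psi(t) = T \, \hat\psi(t) , \qquad \hat\psi(0) = \hat\psi_0 \in L^2(\mathbb{T}^2,\mathbb{C}^2) , \]

we can express the unitary evolution group in terms of the eigenvalues and projections as

\[ U^{\mathcal{F}}(t) = e^{-it |\varpi|} \, P_+ + e^{+it |\varpi|} \, P_- . \]

Also here, the Fourier transform connects the evolution group in momentum and position representation, $U(t) = \mathcal{F} \, U^{\mathcal{F}}(t) \, \mathcal{F}^{-1}$.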
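
For a fixed momentum, the spectral formulas reduce to statements about a single 2x2 matrix. The sketch below checks them numerically; it works with a generic complex number `w` standing in for the off-diagonal entry of T at one momentum, and the specific value of `w`, the projection formula P = (id +- T/|w|)/2 and all helper names are assumptions of this illustration (valid for w != 0).

```python
import cmath

def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][m] * B[m][j] for m in range(2)) for j in range(2)] for i in range(2)]

def combine(a, A, b, B):
    """Entrywise a*A + b*B for 2x2 matrices."""
    return [[a * A[i][j] + b * B[i][j] for j in range(2)] for i in range(2)]

def dagger(A):
    """Conjugate transpose of a 2x2 matrix."""
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]

def distance(A, B):
    """Largest entrywise deviation between two 2x2 matrices."""
    return max(abs(A[i][j] - B[i][j]) for i in range(2) for j in range(2))

ID = [[1, 0], [0, 1]]

def T_matrix(w):
    """T = Re(w) sigma_1 + Im(w) sigma_2 = [[0, conj(w)], [w, 0]]."""
    return [[0, w.conjugate()], [w, 0]]

def projections(w):
    """P_+- = (id +- T/|w|)/2, the eigenprojections for the eigenvalues +-|w|."""
    T = T_matrix(w)
    return combine(0.5, ID, 0.5 / abs(w), T), combine(0.5, ID, -0.5 / abs(w), T)

def U(t, w):
    """U(t) = e^{-it|w|} P_+ + e^{+it|w|} P_-."""
    P_plus, P_minus = projections(w)
    return combine(cmath.exp(-1j * t * abs(w)), P_plus,
                   cmath.exp(1j * t * abs(w)), P_minus)

w = 1 + 0.8 * cmath.exp(0.4j) - 0.3 * cmath.exp(-1.1j)   # hypothetical sample value
T = T_matrix(w)
P_plus, P_minus = projections(w)

checks = {
    "T P+ = +|w| P+": distance(matmul(T, P_plus), combine(abs(w), P_plus, 0, ID)),
    "T P- = -|w| P-": distance(matmul(T, P_minus), combine(-abs(w), P_minus, 0, ID)),
    "P+ P- = 0": distance(matmul(P_plus, P_minus), combine(0, ID, 0, ID)),
    "U unitary": distance(matmul(U(0.7, w), dagger(U(0.7, w))), ID),
    "group law": distance(matmul(U(0.7, w), U(0.5, w)), U(1.2, w)),
}
print(checks)
```

All checks rest on the identity T^2 = |w|^2 id, which is what makes the closed formulas for eigenvalues and projections possible in the two-band case.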




6.2 The Fourier transform on $\mathbb{R}^n$

There also exists a Fourier transform on $\mathbb{R}^n$ which is defined analogously to Definition 6.1.1. In spirit, the $L^2(\mathbb{R}^n)$ theory is very similar to that for the discrete Fourier transform.

6.2.1 The Fourier transform on $L^1(\mathbb{R}^n)$

The Fourier transform on $\mathbb{R}^n$ is first defined on $L^1(\mathbb{R}^n)$ as follows:

Definition 6.2.1 (Fourier transform) For any $f \in L^1(\mathbb{R}^n)$, we define its Fourier transform

\[ (\mathcal{F}f)(\xi) := \hat f(\xi) := \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} dx \, e^{-i\xi\cdot x} \, f(x) . \]

The prefactor $(2\pi)^{-n/2}$ is a matter of convention. Our choice is motivated by the fact that $\mathcal{F}$ will define a unitary map $L^2(\mathbb{R}^n) \longrightarrow L^2(\mathbb{R}^n)$.
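6.2.1.1 Fundamental properties

Let us begin by investigating some of the fundamental properties. First, just like in the case of the Fourier transform on $\mathbb{T}^n$, the Fourier transform decays at $\infty$.

Lemma 6.2.2 (Riemann-Lebesgue lemma) The Fourier transform of any $f \in L^1(\mathbb{R}^n)$ is an element of $C_\infty(\mathbb{R}^n)$, i. e. $\mathcal{F}f$ is continuous, bounded and decays at infinity,

\[ \lim_{|\xi| \to \infty} \mathcal{F}f(\xi) = 0 . \]

Proof The first part, $\mathcal{F}f \in L^\infty(\mathbb{R}^n) \cap C(\mathbb{R}^n)$, has already been shown on page 42. It remains to show that $\mathcal{F}f$ decays at infinity. But that follows from the fact that any integrable function can be approximated arbitrarily well by a finite linear combination of step functions

\[ 1_A(x) := \begin{cases} 1 & x \in A \\ 0 & x \notin A \end{cases} , \]

and $\lim_{|\xi| \to \infty} (\mathcal{F} 1_A)(\xi) = 0$. Thus, $\lim_{|\xi| \to \infty} \mathcal{F}f(\xi) = 0$ follows. □

We begin to enumerate a few important properties of the Fourier transform. These are tremendously helpful in computations.

Proposition 6.2.3 (Fundamental properties of $\mathcal{F}$) Let $f \in L^1(\mathbb{R}^n)$.

(i) $(\mathcal{F}\bar f)(\xi) = \overline{(\mathcal{F}f)(-\xi)}$

(ii) $\bigl( \mathcal{F}f(-\,\cdot\,) \bigr)(\xi) = (\mathcal{F}f)(-\xi)$

(iii) $\bigl( \mathcal{F}(T_y f) \bigr)(\xi) = e^{-i\xi\cdot y} \, (\mathcal{F}f)(\xi)$ where $(T_y f)(x) := f(x - y)$ for $y \in \mathbb{R}^n$

(iv) $(\mathcal{F}f)(\xi - \eta) = \bigl( \mathcal{F}(e^{+i\eta\cdot x} f) \bigr)(\xi)$

(v) $\bigl( \mathcal{F}(S_\lambda f) \bigr)(\xi) = \lambda^n \, (\mathcal{F}f)(\lambda\xi)$ where $(S_\lambda f)(x) := f(x/\lambda)$, $\lambda > 0$

(vi) For all $f \in C^r(\mathbb{R}^n)$ with $\partial_x^\alpha f \in L^1(\mathbb{R}^n)$, $|\alpha| \le r$, we have $\bigl( \mathcal{F}(\partial_x^\alpha f) \bigr)(\xi) = i^{|\alpha|} \, \xi^\alpha \, (\mathcal{F}f)(\xi)$ for all $|\alpha| \le r$.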

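Lemma 6.2.4 (Fourier transform of a Gaußian) The Fourier transform of the Gaußian function $g_\lambda(x) := e^{-\frac{\lambda}{2} x^2}$, $\lambda > 0$, is $(\mathcal{F}g_\lambda)(\xi) = \lambda^{-n/2} \, e^{-\frac{1}{2\lambda} \xi^2}$.
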
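Proof By the scaling relation in Proposition 6.2.3, it suffices to prove the Lemma for $\lambda = 1$. Moreover, we may set $n = 1$, because

\[ g_1(x) = \prod_{j=1}^n e^{-\frac{1}{2} x_j^2} \]

is just the product of one-dimensional Gaußians. Completing the square, we can express the Fourier transform

\[ (\mathcal{F}g_1)(\xi) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} dx \, e^{-i\xi x} \, e^{-\frac{1}{2} x^2} = e^{-\frac{1}{2} \xi^2} \, \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} dx \, e^{-\frac{1}{2} (x + i\xi)^2} \]

as the product of a Gaußian with

\[ f(\xi) := \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} dx \, e^{-\frac{1}{2} (x + i\xi)^2} . \]

A simple limiting argument shows that we can differentiate $f$ under the integral sign as often as we would like, i. e. $f \in C^\infty(\mathbb{R})$, and that its first derivative vanishes,

\begin{align*}
\frac{d}{d\xi} f(\xi) &= \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} dx \, \bigl( -i (x + i\xi) \bigr) \, e^{-\frac{1}{2} (x + i\xi)^2} \\
&= \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} dx \, i \frac{d}{dx} \Bigl( e^{-\frac{1}{2} (x + i\xi)^2} \Bigr) = \frac{i}{\sqrt{2\pi}} \Bigl[ e^{-\frac{1}{2} (x + i\xi)^2} \Bigr]_{-\infty}^{+\infty} = 0 .
\end{align*}

But a smooth function whose first derivative vanishes everywhere is constant, and its value is

\[ f(0) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} dx \, e^{-\frac{1}{2} x^2} = 1 . \]

□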
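
The case treated in the proof, the one-dimensional Gaußian with exponent -x^2/2, can be checked by direct quadrature. The sketch below is an illustration only: the trapezoidal rule, the cut-off `L`, the step `h` and the function names are assumptions of this example, and the prefactor (2 pi)^{-1/2} follows Definition 6.2.1 for n = 1.

```python
import cmath
import math

def fourier_transform(f, xi, L=12.0, h=0.01):
    """Trapezoidal approximation of (2 pi)^{-1/2} int_{-L}^{L} dx e^{-i xi x} f(x)."""
    steps = int(round(2 * L / h))
    total = 0j
    for m in range(steps + 1):
        x = -L + m * h
        weight = 0.5 if m in (0, steps) else 1.0
        total += weight * cmath.exp(-1j * xi * x) * f(x)
    return total * h / math.sqrt(2 * math.pi)

def gaussian(x):
    """The one-dimensional Gaussian e^{-x^2/2}."""
    return math.exp(-x * x / 2)

# Deviation between the numerical transform and e^{-xi^2/2} at a few frequencies.
errors = [abs(fourier_transform(gaussian, xi) - math.exp(-xi * xi / 2))
          for xi in (0.0, 0.5, 1.0, 3.0)]
print(errors)
```

The Gaußian e^{-x^2/2} is thus reproduced by its own Fourier transform, up to quadrature error.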


The Fourier transform also has an inverse: 


Proposition 6.2.5 Assume $f \in L^1(\mathbb{R}^n)$ is such that also its Fourier transform $\hat f = \mathcal{F}f$ is integrable. Then for this function, the inverse Fourier transform

\[ (\mathcal{F}^{-1} \hat f)(x) = \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} d\xi \, e^{+i\xi\cdot x} \, \hat f(\xi) = (\mathcal{F}\hat f)(-x) \]

agrees with $f \in L^1(\mathbb{R}^n)$.

We will postpone the proof until the next subsection, but the idea is that in the sense of distributions (cf. Section 7) one has

\[ \int_{\mathbb{R}^n} dx \, e^{+i\xi\cdot x} = (2\pi)^n \, \delta(\xi) \]

where $\delta$ is the Dirac distribution. A rigorous argument is more involved, though.


6.2.1.2 The convolution 


The convolution arises naturally from the group structure (a discussion for another time) 
and it appears naturally in the discussion, because the Fourier transform intertwines the 
pointwise product of functions and the convolution (cf. Proposition 6.2.7). 

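Definition 6.2.6 (Convolution) We define the convolution of $f, g \in L^1(\mathbb{R}^n)$ to be

\[ (f * g)(x) := \int_{\mathbb{R}^n} dy \, f(x - y) \, g(y) . \]

We have seen in the exercises that $* : L^1(\mathbb{R}^n) \times L^1(\mathbb{R}^n) \longrightarrow L^1(\mathbb{R}^n)$, i. e. the convolution of two $L^1$ functions is again integrable. The Fourier transform intertwines the convolution and the pointwise product of functions:

Proposition 6.2.7 $\mathcal{F}(f * g) = (2\pi)^{n/2} \, \mathcal{F}f \, \mathcal{F}g$ holds for all $f, g \in L^1(\mathbb{R}^n)$.

Proof For any $f, g \in L^1(\mathbb{R}^n)$ also their convolution is integrable. Then we obtain the claim from direct computation:

\begin{align*}
\bigl( \mathcal{F}(f * g) \bigr)(\xi) &= \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} dx \, e^{-i\xi\cdot x} \, (f * g)(x) \\
&= \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} dx \int_{\mathbb{R}^n} dy \, e^{-i\xi\cdot(x-y)} \, e^{-i\xi\cdot y} \, f(x - y) \, g(y) \\
&= (2\pi)^{n/2} \, \mathcal{F}f(\xi) \, \mathcal{F}g(\xi)
\end{align*}

□

Also convolutions on $L^1(\mathbb{R}^n)$ have approximate identities.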
Definition 6.2.8 (Approximate identity) An approximate identity or Dirac sequence is a family of non-negative functions $(\delta_\varepsilon)_{\varepsilon \in (0,\varepsilon_0)} \subset L^1(\mathbb{R}^n)$, $\varepsilon_0 > 0$, with the following two properties:

(i) $\|\delta_\varepsilon\|_{L^1(\mathbb{R}^n)} = 1$ holds for all $\varepsilon \in (0,\varepsilon_0)$.

(ii) For any $R > 0$ we have $\lim_{\varepsilon \to 0} \int_{|x| \le R} dx \, \delta_\varepsilon(x) = 1$.

The name “approximate identity” again derives from the following

Theorem 6.2.9 Let $(\delta_\varepsilon)_{\varepsilon \in (0,\varepsilon_0)}$ be an approximate identity. Then for all $f \in L^1(\mathbb{R}^n)$ we have

\[ \lim_{\varepsilon \to 0} \| \delta_\varepsilon * f - f \|_{L^1(\mathbb{R}^n)} = 0 . \]

The interested reader may look up the proof in [LL01, Chapter 2.16].

Example Let $\chi \in L^1(\mathbb{R}^n)$ be a non-negative function normalized to 1, $\|\chi\|_{L^1(\mathbb{R}^n)} = 1$. Then one can show that

\[ \delta_k(x) := k^n \, \chi(kx) \]

is an approximate identity (as $k \to \infty$).

With this in mind, we can show that

Lemma 6.2.10 $C_c^\infty(\mathbb{R}^n)$ is dense in $L^1(\mathbb{R}^n)$.

Proof We will only sketch the proof: The linearity of the convolution and the decomposition

\[ f = (f_{\mathrm{Re}+} - f_{\mathrm{Re}-}) + i \, (f_{\mathrm{Im}+} - f_{\mathrm{Im}-}) \in L^1(\mathbb{R}^n) \]

imply we may assume $f \ge 0$ without loss of generality.

First, we smoothen $f$ by convolving it with a smooth approximate identity, because

\[ \partial_x^\alpha (\delta_k * f) = (\partial_x^\alpha \delta_k) * f \]

holds as shown in the homework assignments. Clearly, the convolution of two non-negative functions is non-negative. One may start with $\chi \in C_c^\infty(\mathbb{R}^n)$ and then scale it like in the example.

To make the support compact, we multiply with a cutoff function, e. g. pick a second function $\mu \in C_c^\infty(\mathbb{R}^n)$ taking values between 0 and 1 which is normalized to $1 = \|\mu\|_{L^\infty(\mathbb{R}^n)}$ and satisfies

\[ \mu(x) = \begin{cases} 1 & |x| \le 1 \\ 0 & |x| \ge 2 \end{cases} . \]

Clearly, $\mu_j(x) := \mu(x/j) \to 1$ as $j \to \infty$ almost everywhere in $x$, and thus $\mu_j \, (\delta_k * f)$ converges to $\delta_k * f$ as $j \to \infty$ by the Monotone Convergence theorem. Thus, $f_k := \mu_k \, (\delta_k * f) \in C_c^\infty(\mathbb{R}^n)$ converges to $f$ in $L^1(\mathbb{R}^n)$. □
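
The convergence of regularizations in the integral norm can be watched on a one-dimensional example. The sketch below is only an illustration: f is the indicator function of [0, 1], the kernel is the hat function chi(x) = max(0, 1 - |x|) scaled as k chi(kx) (non-negative with integral 1, but not smooth, which suffices for an approximate identity), and the convolution is written in closed form through the cumulative distribution of chi. All names and numerical parameters are assumptions of this example.

```python
def hat_cdf(s):
    """Cumulative distribution function of the hat function chi(x) = max(0, 1 - |x|)."""
    if s <= -1:
        return 0.0
    if s <= 0:
        return (1 + s) ** 2 / 2
    if s < 1:
        return 1 - (1 - s) ** 2 / 2
    return 1.0

def smoothed(k, x):
    """(delta_k * f)(x) for delta_k(x) = k chi(k x) and f the indicator of [0, 1]."""
    return hat_cdf(k * x) - hat_cdf(k * (x - 1))

def indicator(x):
    """The indicator function of the interval [0, 1]."""
    return 1.0 if 0 <= x <= 1 else 0.0

def l1_distance(k, a=-2.0, b=3.0, M=50000):
    """Midpoint-rule approximation of the L^1 norm of delta_k * f - f on [a, b]."""
    h = (b - a) / M
    return sum(abs(smoothed(k, a + (m + 0.5) * h) - indicator(a + (m + 0.5) * h))
               for m in range(M)) * h

distances = [l1_distance(k) for k in (2, 8, 32)]
print(distances)
```

For this particular pair one can compute the distance by hand: each jump of f contributes 1/(3k), so the L^1 distance equals 2/(3k) for k >= 2 and tends to 0 as k grows.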

6.2.2 Decay of Fourier transforms 

We can prove an analog of Theorem 6.1.15. 

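Theorem 6.2.11 (Regularity of $f$ $\longleftrightarrow$ decay of $\mathcal{F}f$) (i) Let $s \in \mathbb{N}_0$, $\delta \in (0,1)$ and assume that the Fourier transform of $f \in L^1(\mathbb{R}^n)$ decays as

\[ |\hat f(\xi)| \le C \, (1 + |\xi|)^{-n-s-\delta} . \tag{6.2.1} \]

Then $f \in C^s(\mathbb{R}^n)$ and $\partial_x^\alpha f \in L^\infty(\mathbb{R}^n)$ holds for all $|\alpha| \le s$.

(ii) Assume $f \in C^s(\mathbb{R}^n)$ is such that all derivatives up to order $s$ are integrable. Then the Fourier transform $\mathcal{F}f$ satisfies $\lim_{|\xi| \to \infty} \bigl( |\xi|^r \hat f(\xi) \bigr) = 0$ for $r \le s$.

(iii) $f \in C^\infty(\mathbb{R}^n)$, $\partial_x^\alpha f \in L^1(\mathbb{R}^n)$ for all $\alpha \in \mathbb{N}_0^n$ holds if and only if for all $r \ge 0$ there exists $C_r > 0$ such that

\[ |\hat f(\xi)| \le C_r \, (1 + |\xi|)^{-r} . \]

Proof (i) The decay (6.2.1) implies $\hat f$ and $\xi^\alpha \hat f$, $|\alpha| \le s$, are in fact integrable. Thus, the inverse

\[ \bigl( \mathcal{F}^{-1} ( i^{|\alpha|} \xi^\alpha \hat f ) \bigr)(x) = \partial_x^\alpha (\mathcal{F}\hat f)(-x) = \partial_x^\alpha f(x) \]

exists as long as $|\alpha| \le s$ and is an element of $L^\infty(\mathbb{R}^n)$ by the Riemann-Lebesgue lemma 6.2.2.

(ii) This is a consequence of Proposition 6.2.3 and the Riemann-Lebesgue lemma 6.2.2.

(iii) Just like in the discrete case, this follows immediately from (i) and (ii). □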

6.2.3 The Fourier transform on $L^2(\mathbb{R}^n)$

The difficulty of defining the Fourier transform on $L^p(\mathbb{R}^n)$ spaces is that they do not nest akin to Lemma 6.1.6. Hence, we will need some preparation. First of all, it is easy to see that the convolution can also be seen as a continuous map

\[ * : L^1(\mathbb{R}^n) \times L^p(\mathbb{R}^n) \longrightarrow L^p(\mathbb{R}^n) . \]

Moreover, convolving with a suitable approximate identity is a standard method to regularize $L^p(\mathbb{R}^n)$ functions:

Lemma 6.2.12 If $(\delta_\varepsilon)$ is an approximate identity and $f \in L^p(\mathbb{R}^n)$, $1 \le p < \infty$, then $\delta_\varepsilon * f$ converges to $f$ in $L^p(\mathbb{R}^n)$.

The interested reader may look up the proof in [LL01, Chapter 2.16]. Hence, we can generalize Lemma 6.2.10 to

Lemma 6.2.13 $C_c^\infty(\mathbb{R}^n)$ is dense in $L^p(\mathbb{R}^n)$ for $1 \le p < \infty$.

An immediate consequence of the Lemma is that $L^1(\mathbb{R}^n) \cap L^p(\mathbb{R}^n)$ is dense in $L^p(\mathbb{R}^n)$, because $C_c^\infty(\mathbb{R}^n) \subset L^1(\mathbb{R}^n) \cap L^p(\mathbb{R}^n) \subset L^p(\mathbb{R}^n)$ lies densely in $L^p(\mathbb{R}^n)$.

The Gaußian can be conveniently used to define the Fourier transform on $L^2(\mathbb{R}^n)$.

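Theorem 6.2.14 (Plancherel’s theorem) If $f \in L^1(\mathbb{R}^n) \cap L^2(\mathbb{R}^n)$, then $\hat f$ is in $L^2(\mathbb{R}^n)$ and has the same $L^2(\mathbb{R}^n)$-norm as $f$,

\[ \| f \|_{L^2(\mathbb{R}^n)} = \| \hat f \|_{L^2(\mathbb{R}^n)} . \]

Hence, $f \mapsto \mathcal{F}f$ has a unique continuous extension to a unitary map $\mathcal{F} : L^2(\mathbb{R}^n) \longrightarrow L^2(\mathbb{R}^n)$. Moreover, Parseval’s formula holds, i. e.

\[ \langle f , g \rangle_{L^2(\mathbb{R}^n)} = \langle \mathcal{F}f , \mathcal{F}g \rangle_{L^2(\mathbb{R}^n)} \]

holds for all $f, g \in L^2(\mathbb{R}^n)$.

Proof Initially, pick any $f \in L^1(\mathbb{R}^n) \cap L^2(\mathbb{R}^n)$. Then according to the Riemann-Lebesgue lemma 6.2.2, $\mathcal{F}f$ is essentially bounded, and hence

\[ \int_{\mathbb{R}^n} d\xi \, |\hat f(\xi)|^2 \, e^{-\frac{\varepsilon}{2} \xi^2} \tag{6.2.2} \]

is finite. Since $f \in L^1(\mathbb{R}^n)$ by assumption, the function $\overline{f(x)} \, f(y) \, e^{-\frac{\varepsilon}{2} \xi^2}$ depending on the three variables $(x, y, \xi)$ is an element of $L^1(\mathbb{R}^{3n})$. Then writing out the Fourier transforms in the above integral and integrating over $\xi$, we obtain

\begin{align*}
(6.2.2) &= \frac{1}{(2\pi)^n} \int_{\mathbb{R}^n} d\xi \int_{\mathbb{R}^n} dx \int_{\mathbb{R}^n} dy \, e^{+i\xi\cdot x} \, \overline{f(x)} \, e^{-i\xi\cdot y} \, f(y) \, e^{-\frac{\varepsilon}{2} \xi^2} \\
&= \frac{1}{(2\pi)^n} \int_{\mathbb{R}^n} dx \int_{\mathbb{R}^n} dy \, \overline{f(x)} \, f(y) \, (2\pi)^{n/2} \, \varepsilon^{-n/2} \, e^{-\frac{1}{2\varepsilon} (x - y)^2} \\
&= \int_{\mathbb{R}^n} dy \, \overline{f(y)} \, \Bigl( (2\pi\varepsilon)^{-n/2} \, e^{-\frac{1}{2\varepsilon} x^2} * f \Bigr)(y) .
\end{align*}

Then by Lemma 6.2.12 the function $(2\pi\varepsilon)^{-n/2} \, e^{-\frac{1}{2\varepsilon} x^2} * f$ converges to $f$ in $L^2(\mathbb{R}^n)$ as $\varepsilon \to 0$, and by Dominated convergence the above expression approaches $\| f \|^2$. On the other hand, the above is equal to (6.2.2); moreover, we may interchange integration and limit $\varepsilon \to 0$ in (6.2.2) by the Monotone Convergence theorem, and thus we have proven $\| f \| = \| \hat f \|$ if $f \in L^1(\mathbb{R}^n) \cap L^2(\mathbb{R}^n)$.

By density of $L^1(\mathbb{R}^n) \cap L^2(\mathbb{R}^n)$, this equality extends to all $f \in L^2(\mathbb{R}^n)$ and with the help of Theorem 5.1.6, we deduce that the Fourier transform extends to a continuous map $\mathcal{F} : L^2(\mathbb{R}^n) \longrightarrow L^2(\mathbb{R}^n)$.

Parseval’s formula $\langle f , g \rangle = \langle \mathcal{F}f , \mathcal{F}g \rangle$ follows from the polarization identity

\[ \langle f , g \rangle = \tfrac{1}{2} \Bigl( \| f + g \|^2 - i \, \| f + i g \|^2 - (1 - i) \, \| f \|^2 - (1 - i) \, \| g \|^2 \Bigr) . \]

Parseval’s formula also implies the unitarity of $\mathcal{F}$. □

The above Theorem also gives a definition of the Fourier transform on $L^2(\mathbb{R}^n)$.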

6.2.4 The solution of the heat equation on $\mathbb{R}^n$

Just like in the discrete case, the Fourier transform converts the heat equation, a linear PDE, into a linear ODE. This connection explains how to arrive at the solution to the heat equation given in problem 18: we first Fourier transform left- and right-hand side of

\[ \partial_t u(t) = +D \, \Delta_x u(t) , \qquad u(0) = u_0 \in L^1(\mathbb{R}^n) , \tag{6.2.3} \]

and obtain the heat equation in Fourier representation,

\[ \partial_t \hat u(t) = -D \, \hat\xi^2 \, \hat u(t) , \qquad \hat u(0) = \hat u_0 \in C_\infty(\mathbb{R}^n) . \tag{6.2.4} \]

Here, $\hat u$ and $\hat u_0$ are the Fourier transforms of $u$ and $u_0$, respectively, and $D > 0$ is the diffusion constant. $\hat\xi^2$ stands for the multiplication operator associated to the function $\xi \mapsto \xi^2$, i. e. we set

\[ (\hat\xi^2 \hat u)(\xi) := \xi^2 \, \hat u(\xi) . \]

Since the Laplacian in Fourier representation is just a multiplication operator, the solution to (6.2.4) is

\[ \hat u(t) = e^{-tD \hat\xi^2} \, \hat u_0 \in L^1(\mathbb{R}^n) . \]

The first factor is just a Gaußian, hence $\hat u(t)$ is integrable for $t > 0$ and its inverse Fourier transform exists. The solution in position representation is just the inverse Fourier transform applied to $\hat u(t)$: Proposition 6.2.7 tells us that $u(t)$ can be seen as the inverse Fourier transform of $e^{-tD\xi^2}$ convolved with $u_0$, and using Lemma 6.2.4 we compute

\begin{align*}
u(t) &= \mathcal{F}^{-1} \bigl( e^{-tD \hat\xi^2} \, \hat u_0 \bigr) \\
&= \frac{1}{(4\pi D t)^{n/2}} \, e^{-\frac{x^2}{4Dt}} * u_0 =: G(t) * u_0 . \tag{6.2.5}
\end{align*}

Note that the right-hand side exists in $L^1(\mathbb{R}^n)$ as the convolution of two $L^1(\mathbb{R}^n)$ functions. Moreover, one can show that $(4\pi D t)^{-n/2} \, e^{-\frac{x^2}{4Dt}}$ is a Dirac sequence as $t \searrow 0$ so that $\lim_{t \searrow 0} u(t) = u_0$ holds where the limit is understood in the $L^1(\mathbb{R}^n)$-sense.
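
The convolution formula for the solution can be tested against a case with a known closed form. The following sketch assumes n = 1, the diffusion constant D = 1 and the initial condition u_0(x) = e^{-x^2/2}; these choices, the Riemann-sum quadrature and the helper names are assumptions of this illustration. Convolving two Gaußians adds their variances, which gives u(t, x) = (1 + 2Dt)^{-1/2} e^{-x^2 / (2 (1 + 2Dt))}.

```python
import math

D = 1.0   # diffusion constant (hypothetical value)

def heat_kernel(t, x):
    """G(t)(x) = (4 pi D t)^{-1/2} e^{-x^2 / (4 D t)} in one dimension."""
    return math.exp(-x * x / (4 * D * t)) / math.sqrt(4 * math.pi * D * t)

def u0(x):
    """Hypothetical initial condition: a Gaussian."""
    return math.exp(-x * x / 2)

def solution(t, x, L=15.0, h=0.01):
    """(G(t) * u0)(x), approximated by a Riemann sum over y in [-L, L]."""
    steps = int(round(2 * L / h))
    return sum(heat_kernel(t, x - (-L + m * h)) * u0(-L + m * h)
               for m in range(steps + 1)) * h

def exact(t, x):
    """Closed form for this initial condition: the variances 1 and 2 D t add up."""
    s = 1 + 2 * D * t
    return math.exp(-x * x / (2 * s)) / math.sqrt(s)

errors = [abs(solution(0.3, x) - exact(0.3, x)) for x in (0.0, 0.8, -2.0)]
print(errors)
```

The profile stays Gaußian while its width grows like the square root of 1 + 2Dt, which is the familiar spreading of heat.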

Uniqueness of solutions to the heat equation Let us revisit a topic that has not seen 
much attention up to now, namely whether solutions exist for all times and whether they 
are unique. 

Theorem 6.2.15 The initial value problem (6.2.3) has (6.2.5) as its unique solution if we
require $u(t), \partial_t u(t) \in L^1(\mathbb{R}^n)$ to hold for $t > 0$.

Proof The arguments preceding this theorem show that $u(t)$ as given by (6.2.5) defines a
solution which is integrable for $t > 0$; moreover, computing $\Delta_x G(t)$ shows that it is also
integrable, and hence, $\partial_t u(t) \in L^1(\mathbb{R}^n)$ is also established.

So assume $\tilde{u}(t)$ is another integrable solution to (6.2.3) with $\partial_t \tilde{u}(t) \in L^1(\mathbb{R}^n)$ and define
the difference

\[ g(t) := u(t) - \tilde{u}(t). \]

Clearly, this difference vanishes at $t = 0$, i. e. $g(0) = 0$. Since $u(t)$ and $\tilde{u}(t)$ as well as
their time-derivatives are integrable, also $g(t), \partial_t g(t) \in L^1(\mathbb{R}^n)$.

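The Riemann-Lebesgue Lemma 6.2.2 implies that the Fourier transform of the difference
$\hat{g}(t) := \mathcal{F} g(t)$ is an element of $C_\infty(\mathbb{R}^n)$. Hence, equations (2.2.7) and (6.2.4) yield

\[ \frac{d}{dt} \bigl| \hat{g}(t,\xi) \bigr| \le \Bigl| \frac{d}{dt} \hat{g}(t,\xi) \Bigr| = \bigl| -D \xi^2 \, \hat{g}(t,\xi) \bigr| = D \xi^2 \, \bigl| \hat{g}(t,\xi) \bigr| \qquad \forall \xi \in \mathbb{R}^n, \]

which is then the initial estimate for the Grönwall lemma 2.2.6,

\[ 0 \le \bigl| \hat{g}(t,\xi) \bigr| \le \bigl| \hat{g}(0,\xi) \bigr| \, e^{D \xi^2 t} = 0. \]

Since $\hat{g}(t,\xi)$ is continuous in $\xi$, this shows $\hat{g}(t) = 0$ and thus $g(t) = 0$, i. e. (6.2.5) is the unique solution in $L^1(\mathbb{R}^n)$. □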




To show that the condition $u(t) \in L^1(\mathbb{R}^n)$ is crucial for the uniqueness, we give a concrete
counterexample first found by Tychonoff [Tyc35]. For simplicity, we reduce to the one-dimensional
case and set the diffusion constant $D$ to $1$. Define the function

\[ u(t,x) := \frac{1}{\sqrt{1-t}} \, e^{+\frac{x^2}{4(1-t)}} \]

for $t \in [0,1)$. A simple computation yields that $u(t)$ satisfies the heat equation

\[ \partial_t u(t) = +\partial_x^2 u(t) \]

to the initial condition $u(0,x) = e^{+\frac{x^2}{4}}$:


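\[ \partial_t u(t,x) = +\tfrac{1}{2} (1-t)^{-3/2} \, e^{+\frac{x^2}{4(1-t)}} + (1-t)^{-1/2} \cdot (+1) \cdot \frac{x^2}{4(1-t)^2} \, e^{+\frac{x^2}{4(1-t)}} = \frac{2(1-t) + x^2}{4(1-t)^{5/2}} \, e^{+\frac{x^2}{4(1-t)}} \]

\[ \partial_x u(t,x) = \frac{x}{2(1-t)^{3/2}} \, e^{+\frac{x^2}{4(1-t)}} \]

\[ \partial_x^2 u(t,x) = \frac{1}{2(1-t)^{3/2}} \, e^{+\frac{x^2}{4(1-t)}} + \frac{x^2}{4(1-t)^{5/2}} \, e^{+\frac{x^2}{4(1-t)}} = \frac{2(1-t) + x^2}{4(1-t)^{5/2}} \, e^{+\frac{x^2}{4(1-t)}} \]

Clearly, the solution explodes as $t \nearrow 1$ and is not integrable for $t \in [0,1)$.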
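
The "simple computation" can also be double-checked numerically. The sketch below assumes nothing beyond the explicit formula for $u(t,x)$ given above; it approximates $\partial_t u - \partial_x^2 u$ by central differences at a few sample points (the points and the step size `h` are arbitrary illustrative choices), so the residual is only expected to vanish up to the discretization error.

```python
import math

def u(t, x):
    """u(t, x) = (1 - t)^(-1/2) exp(x^2 / (4 (1 - t))) for 0 <= t < 1."""
    return math.exp(x * x / (4.0 * (1.0 - t))) / math.sqrt(1.0 - t)

def heat_residual(t, x, h=1e-3):
    """Central-difference approximation of d_t u - d_x^2 u at the point (t, x)."""
    dt_u = (u(t + h, x) - u(t - h, x)) / (2.0 * h)
    dxx_u = (u(t, x + h) - 2.0 * u(t, x) + u(t, x - h)) / (h * h)
    return dt_u - dxx_u

sample_points = [(0.1, 0.0), (0.3, 0.5), (0.5, -1.0)]
for t, x in sample_points:
    print(t, x, heat_residual(t, x))

# The prefactor (1 - t)^(-1/2) already diverges at x = 0 as t approaches 1.
print([u(t, 0.0) for t in (0.9, 0.99, 0.999)])
```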

More careful study shows that asking for $u(t) \in L^1(\mathbb{R}^n)$ is a lot stronger than necessary.
In fact, we will see in the next chapter that requiring the solution $u(t)$ to remain a
tempered distribution suffices to make the solution unique.


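6.2.5 The solution of the free Schrödinger equation on $\mathbb{R}^n$

Now we will apply this to the free Schrödinger operator $H = -\frac{1}{2} \Delta_x$ where the natural
space of solutions is $L^2(\mathbb{R}^n)$. Rewriting the free Schrödinger equation in the momentum
representation yields

\[ \mathcal{F}\bigl( i \partial_t \psi(t) \bigr) = i \partial_t \mathcal{F} \psi(t) = \mathcal{F}\bigl( -\tfrac{1}{2} \Delta_x \psi(t) \bigr). \]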

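Parseval's theorem 6.2.14 tells us that $\mathcal{F} : L^2(\mathbb{R}^n) \to L^2(\mathbb{R}^n)$ is a unitary, and thus we
can equivalently look for $\hat{\psi}(t) = \mathcal{F}(\psi(t)) \in L^2(\mathbb{R}^n)$ which solves the free Schrödinger
equation in the momentum representation.

If we compute the right-hand side of the last equation at $t = 0$ and assume $\psi_0 \in L^1(\mathbb{R}^n) \cap L^2(\mathbb{R}^n)$,
by Proposition 6.2.3 (vi) this leads to

\[ \mathcal{F}\bigl( -\tfrac{1}{2} \Delta_x \psi_0 \bigr) = \tfrac{1}{2} \hat{\xi}^2 \, \mathcal{F} \psi_0. \]

We will revisit this point in Chapter 7.1. Note that we need $-\Delta_x \psi_0 \in L^2(\mathbb{R}^n)$, since this is precisely
the condition defining the domain of $H$, and thus $\mathcal{D}(H)$ consists of those $L^2(\mathbb{R}^n)$-functions $\psi$ for
which $\hat{\xi}^2 \hat{\psi}$ is also in $L^2(\mathbb{R}^n)$.

Again, the Fourier transform converts the PDE into the linear ODE

\[ i \partial_t \hat{\psi}(t) = \tfrac{1}{2} \hat{\xi}^2 \hat{\psi}(t), \qquad \hat{\psi}(0) = \mathcal{F} \psi_0 = \hat{\psi}_0, \tag{6.2.6} \]

which can be solved explicitly by

\[ \hat{\psi}(t) = \hat{U}(t) \hat{\psi}_0 := e^{-i t \frac{1}{2} \hat{\xi}^2} \hat{\psi}_0. \tag{6.2.7} \]

The unitary evolution group associated to $\frac{1}{2} \hat{\xi}^2$ is the unitary multiplication operator
$\hat{U}(t) = e^{-i t \frac{1}{2} \hat{\xi}^2}$, and hence, the evolution group generated by $H = -\frac{1}{2} \Delta_x$ is

\[ U(t) = \mathcal{F}^{-1} \, \hat{U}(t) \, \mathcal{F}. \]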

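$U(t)$ is also unitary: $\hat{U}(t)$ and $\mathcal{F}$ are unitary, and thus

\[ U(t) \, U(t)^* = \mathcal{F}^{-1} \hat{U}(t) \mathcal{F} \, \bigl( \mathcal{F}^{-1} \hat{U}(t) \mathcal{F} \bigr)^* = \mathcal{F}^* e^{-i t \frac{1}{2} \hat{\xi}^2} \mathcal{F} \, \mathcal{F}^* e^{+i t \frac{1}{2} \hat{\xi}^2} \mathcal{F} = \mathrm{id}_{L^2(\mathbb{R}^n)}. \]

Similarly, one deduces $U(t)^* U(t) = \mathrm{id}_{L^2(\mathbb{R}^n)}$. One may be tempted to follow the computation
leading up to (6.2.5), set $D = \frac{1}{2}$, replace $t$ by $it$ and write the solution

\[ \psi(t) = p(t) * \psi_0 \tag{6.2.8} \]

as a convolution of the initial condition $\psi_0$ with the function

\[ p(t,x) := \frac{e^{+i \frac{x^2}{2t}}}{(2 \pi i t)^{n/2}}. \]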

From a technical point of view, the derivation of (6.2.8) is more delicate and will be 
postponed to Chapter 7. 
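
To illustrate the structure $U(t) = \mathcal{F}^{-1} \hat{U}(t) \mathcal{F}$, one can mimic it on a finite periodic grid, where the Fourier transform is replaced by the unitary discrete Fourier transform. This is a minimal sketch and not part of the argument above: the grid size, the box length and the initial state are arbitrary choices, and the discrete transform is written out naively instead of relying on an FFT library.

```python
import cmath
import math

def dft(values, sign):
    """Unitary discrete Fourier transform; sign = -1 (forward) or +1 (inverse)."""
    n = len(values)
    scale = 1.0 / math.sqrt(n)
    return [
        scale * sum(v * cmath.exp(sign * 2j * math.pi * r * k / n)
                    for r, v in enumerate(values))
        for k in range(n)
    ]

def evolve(psi0, t, length):
    """Discrete analog of F^{-1} exp(-i t xi^2 / 2) F on a periodic box of the given length."""
    n = len(psi0)
    psi_hat = dft(psi0, -1)
    evolved = []
    for k, value in enumerate(psi_hat):
        m = k if k < n // 2 else k - n            # signed frequency index
        xi = 2.0 * math.pi * m / length           # discrete momentum
        evolved.append(cmath.exp(-0.5j * t * xi * xi) * value)
    return dft(evolved, +1)

def norm(values):
    """Euclidean norm, the discrete stand-in for the L^2 norm."""
    return math.sqrt(sum(abs(v) ** 2 for v in values))

n, length = 64, 20.0
grid = [-length / 2 + length * r / n for r in range(n)]
psi0 = [cmath.exp(-x * x / 2.0 + 1j * x) for x in grid]
psi_t = evolve(psi0, 0.8, length)
print(norm(psi0), norm(psi_t))
```

Both norms coincide up to rounding, since each of the three factors is unitary; evolving `psi_t` by the time `-0.8` recovers `psi0`, which reflects $U(t)^* = U(-t)$.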


The uncertainty principle One of the fundamentals of quantum mechanics is Heisenberg’s 
uncertainty principle, namely that one cannot arbitrarily localize wave functions in position 
and momentum space simultaneously. This is a particular case of a much more general 
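fact about non-commuting (quantum) observables:

Theorem 6.2.16 (Heisenberg's uncertainty principle) Let $A, B : \mathcal{H} \to \mathcal{H}$ be two bounded
selfadjoint operators on the Hilbert space $\mathcal{H}$. We define the expectation value

\[ \mathbb{E}_\psi(A) := \langle \psi, A \psi \rangle \]

with respect to $\psi \in \mathcal{H}$ with $\|\psi\| = 1$ and the variance

\[ \sigma_\psi(A)^2 := \mathbb{E}_\psi\bigl( (A - \mathbb{E}_\psi(A))^2 \bigr). \]

Then Heisenberg's uncertainty relation holds:

\[ \tfrac{1}{2} \bigl| \mathbb{E}_\psi\bigl( i [A,B] \bigr) \bigr| \le \sigma_\psi(A) \, \sigma_\psi(B) \tag{6.2.9} \]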

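Proof Let $\psi \in \mathcal{H}$ be an arbitrary normalized vector. Due to the selfadjointness of $A$ and
$B$, the expectation values are real,

\[ \mathbb{E}_\psi(A) = \langle \psi, A \psi \rangle = \langle A^* \psi, \psi \rangle = \langle A \psi, \psi \rangle = \overline{\langle \psi, A \psi \rangle} = \overline{\mathbb{E}_\psi(A)}. \]

In general $A$ and $B$ will not have mean $0$, but

\[ \tilde{A} := A - \mathbb{E}_\psi(A) \]

and $\tilde{B} := B - \mathbb{E}_\psi(B)$ do. Hence, we can express the variance of $A$ as an expectation value:

\[ \sigma_\psi(A)^2 = \mathbb{E}_\psi\bigl( (A - \mathbb{E}_\psi(A))^2 \bigr) = \mathbb{E}_\psi\bigl( \tilde{A}^2 \bigr). \]

Moreover, the commutator of $\tilde{A}$ and $\tilde{B}$ coincides with that of $A$ and $B$,

\[ [\tilde{A}, \tilde{B}] = [A,B] - [\mathbb{E}_\psi(A), B] - [A, \mathbb{E}_\psi(B)] + [\mathbb{E}_\psi(A), \mathbb{E}_\psi(B)] = [A,B]. \]

Then expressing the left-hand side of (6.2.9) in terms of the shifted observables $\tilde{A}$ and $\tilde{B}$,
and using the Cauchy-Schwarz inequality as well as the selfadjointness yields Heisenberg's
inequality,

\[ \bigl| \mathbb{E}_\psi\bigl( i [A,B] \bigr) \bigr| = \bigl| \mathbb{E}_\psi\bigl( [\tilde{A}, \tilde{B}] \bigr) \bigr| = \bigl| \langle \psi, \tilde{A} \tilde{B} \psi \rangle - \langle \psi, \tilde{B} \tilde{A} \psi \rangle \bigr| \]
\[ \le \bigl| \langle \tilde{A} \psi, \tilde{B} \psi \rangle \bigr| + \bigl| \langle \tilde{B} \psi, \tilde{A} \psi \rangle \bigr| \le 2 \, \| \tilde{A} \psi \| \, \| \tilde{B} \psi \| \]
\[ = 2 \sqrt{\langle \tilde{A} \psi, \tilde{A} \psi \rangle} \, \sqrt{\langle \tilde{B} \psi, \tilde{B} \psi \rangle} = 2 \sqrt{\langle \psi, \tilde{A}^2 \psi \rangle} \, \sqrt{\langle \psi, \tilde{B}^2 \psi \rangle} = 2 \, \sigma_\psi(A) \, \sigma_\psi(B). \]
□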
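
Since Theorem 6.2.16 is stated for bounded operators, it can be tested directly on matrices. The sketch below uses the Pauli matrices $\sigma_x$ and $\sigma_y$ on $\mathbb{C}^2$ and a hand-picked normalized vector; these choices, as well as all helper names, are assumptions made for illustration and do not appear in the text above.

```python
import math

def matvec(a, v):
    """Matrix-vector product for small dense matrices given as nested lists."""
    return [sum(a[i][j] * v[j] for j in range(len(v))) for i in range(len(a))]

def inner(v, w):
    """Scalar product, anti-linear in the first argument."""
    return sum(x.conjugate() * y for x, y in zip(v, w))

def expectation(a, psi):
    return inner(psi, matvec(a, psi))

def sigma(a, psi):
    """Standard deviation sqrt(E(A^2) - E(A)^2) of a selfadjoint matrix a."""
    mean = expectation(a, psi).real
    second = inner(matvec(a, psi), matvec(a, psi)).real   # <A psi, A psi> = E(A^2)
    return math.sqrt(max(second - mean * mean, 0.0))

def commutator_expectation(a, b, psi):
    """E_psi([A, B]) = <psi, A B psi> - <psi, B A psi>."""
    ab = inner(psi, matvec(a, matvec(b, psi)))
    ba = inner(psi, matvec(b, matvec(a, psi)))
    return ab - ba

A = [[0, 1], [1, 0]]          # Pauli matrix sigma_x
B = [[0, -1j], [1j, 0]]       # Pauli matrix sigma_y
psi = [complex(math.cos(0.3)),
       math.sin(0.3) * complex(math.cos(0.5), math.sin(0.5))]   # normalized vector

lhs = 0.5 * abs(commutator_expectation(A, B, psi))   # |E(i[A,B])| = |E([A,B])|
rhs = sigma(A, psi) * sigma(B, psi)
print(lhs, rhs, lhs <= rhs + 1e-12)
```

For this pair of observables the commutator is $[\sigma_x, \sigma_y] = 2i \sigma_z$, so the left-hand side equals $|\langle \psi, \sigma_z \psi \rangle|$.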


Often Heisenberg's inequality is just stated for the position observable $x_j$ (multiplication
by $x_j$) and the momentum observable $-i\hbar\partial_{x_k}$; even though these are unbounded selfadjoint
operators (cf. the discussion in Chapters 5.3-5.4), this introduces only technical
complications. For instance, the above arguments hold verbatim if we require in addition
$\psi \in C_c^\infty(\mathbb{R}^n) \subset L^2(\mathbb{R}^n)$, and vectors of this type lie dense in $L^2(\mathbb{R}^n)$. Then the left-hand
side of Heisenberg's inequality reduces to $\hbar/2$ for $j = k$ because

\[ [x_j, -i\hbar\partial_{x_k}] \psi = x_j \bigl( -i\hbar \partial_{x_k} \psi \bigr) - (-i\hbar) \partial_{x_k} \bigl( x_j \psi \bigr) = i\hbar \, \delta_{kj} \, \psi \]

and $\psi$ is assumed to have norm 1,

\[ \frac{\hbar}{2} \le \sigma_\psi(x_j) \, \sigma_\psi(-i\hbar\partial_{x_j}). \tag{6.2.10} \]

Skipping over some of the details (there are technical difficulties defining the commutator
of two unbounded operators), we see that one cannot do better than $\hbar/2$ but there are
cases when the right-hand side of (6.2.10) is not even finite.

The physical interpretation of (6.2.9) is that one cannot measure non-commuting observables
simultaneously with arbitrary precision. In his original book on quantum mechanics
[Hei30], Heisenberg takes great care to explain why in specific experiments
position and momentum along the same direction cannot be measured simultaneously
with arbitrary precision, i. e. why increasing the resolution of the position measurement
increases the error of the momentum measurement and vice versa.
Chapter 7

Schwartz functions and tempered distributions


Often one wants to find more general solutions to PDEs, e. g. one may ask whether the
heat equation makes sense in case the initial condition is an element of $L^\infty(\mathbb{R}^d)$. A very
fruitful ansatz which we will explore in Chapter 8 is to ask whether “weak solutions” to a
PDE exist. Weak means that the solution may be a so-called distribution, which is a linear
functional on a space of test functions.

Schwartz functions $\mathcal{S}(\mathbb{R}^d)$ are such a space of test functions, i. e. a space of “very nicely
behaved functions”. The dual of this space of test functions, the tempered distributions,
allow us to extend common operations such as Fourier transforms and derivatives to objects
which may not even be functions.


7.1 Schwartz functions 


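The motivation to define Schwartz functions on $\mathbb{R}^d$ comes from dealing with Fourier transforms:
our class of test functions $\mathcal{S}(\mathbb{R}^d)$ has three defining properties:

(i) $\mathcal{S}(\mathbb{R}^d)$ forms a vector space.

(ii) Stability under derivation, $\partial_x^\alpha \mathcal{S}(\mathbb{R}^d) \subset \mathcal{S}(\mathbb{R}^d)$: for all multiindices $\alpha \in \mathbb{N}_0^d$ and $f \in \mathcal{S}(\mathbb{R}^d)$,
we have $\partial_x^\alpha f \in \mathcal{S}(\mathbb{R}^d)$.

(iii) Stability under Fourier transform, $\mathcal{F} \mathcal{S}(\mathbb{R}^d) \subset \mathcal{S}(\mathbb{R}^d)$: for all $f \in \mathcal{S}(\mathbb{R}^d)$, the Fourier
transform

\[ \mathcal{F} f : \xi \mapsto \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} dx \, e^{-i \xi \cdot x} f(x) \in \mathcal{S}(\mathbb{R}^d) \tag{7.1.1} \]

is also a test function.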

These relatively simple requirements have surprisingly rich implications: 

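(i) $\mathcal{S}(\mathbb{R}^d) \subset L^1(\mathbb{R}^d)$, i. e. any $f \in \mathcal{S}(\mathbb{R}^d)$ and all of its derivatives are integrable.

(ii) $\mathcal{F} : \mathcal{S}(\mathbb{R}^d) \to \mathcal{S}(\mathbb{R}^d)$ acts bijectively: if $f \in \mathcal{S}(\mathbb{R}^d) \subset L^1(\mathbb{R}^d)$, then $\mathcal{F} f \in \mathcal{S}(\mathbb{R}^d) \subset L^1(\mathbb{R}^d)$.

(iii) For all $\alpha \in \mathbb{N}_0^d$, we have $\mathcal{F}\bigl( (-i\partial_x)^\alpha f \bigr) = \xi^\alpha \mathcal{F} f \in \mathcal{S}(\mathbb{R}^d)$. This holds as all derivatives
are integrable.

(iv) Hence, for all $a, \alpha \in \mathbb{N}_0^d$, we have $x^a \partial_x^\alpha f \in \mathcal{S}(\mathbb{R}^d)$.

(v) Translations of Schwartz functions are again Schwartz functions, $f(\cdot - x_0) \in \mathcal{S}(\mathbb{R}^d)$;
this follows from $\mathcal{F} f(\cdot - x_0) = e^{-i \xi \cdot x_0} \, \mathcal{F} f \in \mathcal{S}(\mathbb{R}^d)$ for all $x_0 \in \mathbb{R}^d$.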

This leads to the following definition: 

Definition 7.1.1 (Schwartz functions) The space of Schwartz functions

\[ \mathcal{S}(\mathbb{R}^d) := \bigl\{ f \in C^\infty(\mathbb{R}^d) \; \big| \; \forall a, \alpha \in \mathbb{N}_0^d : \|f\|_{a\alpha} < \infty \bigr\} \]

is defined in terms of the family of seminorms¹ indexed by $a, \alpha \in \mathbb{N}_0^d$,

\[ \|f\|_{a\alpha} := \sup_{x \in \mathbb{R}^d} \bigl| x^a \partial_x^\alpha f(x) \bigr|, \qquad f \in C^\infty(\mathbb{R}^d). \]

The family of seminorms defines a so-called Fréchet topology: put in simple terms, to make
sure that sequences in $\mathcal{S}(\mathbb{R}^d)$ converge to rapidly decreasing smooth functions, we need
to control all derivatives as well as the decay. This is also the reason why there is no
norm on $\mathcal{S}(\mathbb{R}^d)$ which generates the same topology as the family of seminorms. However,
$\|f\|_{a\alpha} = 0$ for all $a, \alpha \in \mathbb{N}_0^d$ ensures $f = 0$, i. e. all seminorms put together can distinguish
points.

Example Two simple examples of Schwartz functions are

\[ f(x) = e^{-a x^2}, \qquad a > 0, \]

and

\[ g(x) = \begin{cases} e^{-\frac{1}{1 - x^2} + 1} & |x| < 1 \\ 0 & |x| \ge 1 \end{cases}. \]

The second one even has compact support.
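
To get a feeling for the seminorms, one can evaluate two of them for the Gaußian $f(x) = e^{-x^2}$ in dimension $d = 1$. This is a rough sketch only: the supremum is approximated by a maximum over a finite grid (grid width and resolution are arbitrary choices), which is harmless here because the functions involved decay rapidly.

```python
import math

def f(x):
    """The Gaussian f(x) = exp(-x^2)."""
    return math.exp(-x * x)

def df(x):
    """Its first derivative f'(x) = -2 x exp(-x^2)."""
    return -2.0 * x * math.exp(-x * x)

def sup_on_grid(func, half_width=10.0, steps=200001):
    """Approximate sup |func| by the maximum over an equidistant grid."""
    h = 2.0 * half_width / (steps - 1)
    return max(abs(func(-half_width + k * h)) for k in range(steps))

norm_20 = sup_on_grid(lambda x: x ** 2 * f(x))   # seminorm with a = 2, alpha = 0
norm_01 = sup_on_grid(df)                         # seminorm with a = 0, alpha = 1

# Elementary calculus gives the exact values 1/e (at x = 1) and sqrt(2/e) (at x = 1/sqrt(2)).
print(norm_20, math.exp(-1.0))
print(norm_01, math.sqrt(2.0 / math.e))
```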

¹ A seminorm has all properties of a norm except that $\|f\| = 0$ does not necessarily imply $f = 0$.


The first major fact we will establish is completeness. 


Theorem 7.1.2 The space of Schwartz functions endowed with

\[ d(f,g) := \sum_{n=0}^{\infty} 2^{-n} \sup_{|a| + |\alpha| = n} \frac{\|f - g\|_{a\alpha}}{1 + \|f - g\|_{a\alpha}} \]

is a complete metric space.


Proof $d$ is positive and symmetric. It also satisfies the triangle inequality as $x \mapsto \frac{x}{1+x}$ is
concave on $\mathbb{R}_0^+$ and all of the seminorms satisfy the triangle inequality. Hence, $\bigl( \mathcal{S}(\mathbb{R}^d), d \bigr)$
is a metric space.

To show completeness, take a Cauchy sequence $(f_n)$ with respect to $d$. By definition and
positivity, this means $(f_n)$ is also a Cauchy sequence with respect to all of the seminorms
$\|\cdot\|_{a\alpha}$. Each of the $(x^a \partial_x^\alpha f_n)$ converges to some $g_{a\alpha}$ as the space of bounded continuous
functions $C_b(\mathbb{R}^d)$ with sup norm is complete. It remains to show that $g_{a\alpha} = x^a \partial_x^\alpha g_{00}$.
Clearly, only taking derivatives is problematic: we will prove this for $|\alpha| = 1$, the general
result follows from a simple induction. Assume we are interested in the sequence $(\partial_{x_k} f_n)$,
$k \in \{1, \ldots, d\}$. With $\alpha_k := (0, \ldots, 0, 1, 0, \ldots)$ as the multiindex that has a $1$ in the $k$th entry
and $e_k := (0, \ldots, 0, 1, 0, \ldots) \in \mathbb{R}^d$ as the $k$th canonical base vector, we know that

\[ f_n(x) = f_n(x - x_k e_k) + \int_0^{x_k} ds \, \partial_{x_k} f_n\bigl( x + (s - x_k) e_k \bigr) \]

as well as

\[ g_{00}(x) = g_{00}(x - x_k e_k) + \int_0^{x_k} ds \, g_{0\alpha_k}\bigl( x + (s - x_k) e_k \bigr) \]

hold since $f_n \to g_{00}$ and $\partial_{x_k} f_n \to g_{0\alpha_k}$ uniformly. Hence, $g_{00} \in C^1(\mathbb{R}^d)$ and the derivative
of $g_{00}$ coincides with $g_{0\alpha_k}$, $\partial_{x_k} g_{00} = g_{0\alpha_k}$. We then proceed by induction to show $g_{00} \in
C^\infty(\mathbb{R}^d)$. This means $d(f_n, g_{00}) \to 0$ as $n \to \infty$ and $\mathcal{S}(\mathbb{R}^d)$ is complete. □


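The $L^p$ norm of each element in $\mathcal{S}(\mathbb{R}^d)$ can be dominated by finitely many seminorms:

Lemma 7.1.3 Let $f \in \mathcal{S}(\mathbb{R}^d)$. Then for each $1 \le p < \infty$, the $L^p$ norm of $f$ can be dominated
by a finite number of seminorms,

\[ \|f\|_{L^p(\mathbb{R}^d)} \le C_1(d) \, \|f\|_{00} + C_2(d) \max_{|a| = 2n(d)} \|f\|_{a0}, \]

where $C_1(d), C_2(d) \in \mathbb{R}^+$ and $n(d) \in \mathbb{N}_0$ only depend on the dimension of $\mathbb{R}^d$. Hence,
$f \in L^p(\mathbb{R}^d)$.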


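Proof We split the integral on $\mathbb{R}^d$ into an integral over the unit ball centered at the origin
and its complement: let $B_n := \max_{|a| = 2n} \|f\|_{a0}$, then

\[ \|f\|_{L^p(\mathbb{R}^d)} = \Bigl( \int_{\mathbb{R}^d} dx \, |f(x)|^p \Bigr)^{1/p} \le \Bigl( \int_{|x| \le 1} dx \, |f(x)|^p \Bigr)^{1/p} + \Bigl( \int_{|x| > 1} dx \, |f(x)|^p \Bigr)^{1/p} \]
\[ \le \|f\|_{00} \Bigl( \int_{|x| \le 1} dx \, 1 \Bigr)^{1/p} + \Bigl( \int_{|x| > 1} dx \, |f(x)|^p \, \frac{|x|^{2np}}{|x|^{2np}} \Bigr)^{1/p} \]
\[ \le \mathrm{Vol}\bigl( B_1(0) \bigr)^{1/p} \, \|f\|_{00} + B_n \Bigl( \int_{|x| > 1} dx \, \frac{1}{|x|^{2np}} \Bigr)^{1/p}. \]

In the last step, $|x|^{2n} |f(x)|$ has been estimated by $B_n$ up to a combinatorial constant which depends
only on $d$ and $n$ and which we absorb into $C_2(d)$. If we choose $n$ large enough, $|x|^{-2np}$ is integrable
on $\{ |x| > 1 \}$ and the integral can be computed explicitly, and we get

\[ \|f\|_{L^p(\mathbb{R}^d)} \le C_1(d) \, \|f\|_{00} + C_2(d) \max_{|a| = 2n} \|f\|_{a0}. \]

This concludes the proof. □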

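Lemma 7.1.4 The smooth functions with compact support $C_c^\infty(\mathbb{R}^d)$ are dense in $\mathcal{S}(\mathbb{R}^d)$.

Proof Take any $f \in \mathcal{S}(\mathbb{R}^d)$ and choose

\[ g(x) = \begin{cases} e^{-\frac{1}{1 - x^2} + 1} & |x| < 1 \\ 0 & |x| \ge 1 \end{cases}. \]

Then $f_n := g(\cdot / n) \, f$ converges to $f$ in $\mathcal{S}(\mathbb{R}^d)$, i. e.

\[ \lim_{n \to \infty} \|f_n - f\|_{a\alpha} = 0 \]

holds for all $a, \alpha \in \mathbb{N}_0^d$. □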

Next, we will show that $\mathcal{F} : \mathcal{S}(\mathbb{R}^d) \to \mathcal{S}(\mathbb{R}^d)$ is a continuous and bijective map from $\mathcal{S}(\mathbb{R}^d)$
onto itself.

Theorem 7.1.5 The Fourier transform $\mathcal{F}$ as defined by equation (7.1.1) maps $\mathcal{S}(\mathbb{R}^d)$ continuously
and bijectively onto itself. The inverse $\mathcal{F}^{-1}$ is continuous as well. Furthermore, for
all $f \in \mathcal{S}(\mathbb{R}^d)$ and $a, \alpha \in \mathbb{N}_0^d$, we have

\[ \mathcal{F}\bigl( x^a (-i\partial_x)^\alpha f \bigr) = (+i\partial_\xi)^a \xi^\alpha \mathcal{F} f. \tag{7.1.2} \]


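Proof We need to prove $\mathcal{F}\bigl( x^a (-i\partial_x)^\alpha f \bigr) = (+i\partial_\xi)^a \xi^\alpha \mathcal{F} f$ first: since $x^a \partial_x^\alpha f$ is integrable,
its Fourier transform exists and is continuous by Dominated Convergence. For any $a, \alpha \in
\mathbb{N}_0^d$, we have

\[ \bigl( \mathcal{F}( x^a (-i\partial_x)^\alpha f ) \bigr)(\xi) = \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} dx \, e^{-i \xi \cdot x} \, x^a (-i\partial_x)^\alpha f(x) \]
\[ = \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} dx \, \bigl( (+i\partial_\xi)^a e^{-i \xi \cdot x} \bigr) (-i\partial_x)^\alpha f(x) \]
\[ \overset{*}{=} (+i\partial_\xi)^a \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} dx \, e^{-i \xi \cdot x} (-i\partial_x)^\alpha f(x). \]

In the step marked with $*$, we have used Dominated Convergence to interchange integration
and differentiation. Now we integrate partially $|\alpha|$ times and use that the boundary
terms vanish,

\[ \bigl( \mathcal{F}( x^a (-i\partial_x)^\alpha f ) \bigr)(\xi) = (+i\partial_\xi)^a \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} dx \, \bigl( (+i\partial_x)^\alpha e^{-i \xi \cdot x} \bigr) f(x) \]
\[ = (+i\partial_\xi)^a \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} dx \, \xi^\alpha e^{-i \xi \cdot x} f(x) = \bigl( (+i\partial_\xi)^a \xi^\alpha \mathcal{F} f \bigr)(\xi). \]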


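To show $\mathcal{F}$ is continuous, we need to estimate the seminorms of $\mathcal{F} f$ by those of $f$: for
any $a, \alpha \in \mathbb{N}_0^d$, it holds

\[ \|\mathcal{F} f\|_{a\alpha} = \sup_{\xi \in \mathbb{R}^d} \bigl| ( \xi^a \partial_\xi^\alpha \mathcal{F} f )(\xi) \bigr| = \sup_{\xi \in \mathbb{R}^d} \bigl| \bigl( \mathcal{F}( \partial_x^a x^\alpha f ) \bigr)(\xi) \bigr| \le \frac{1}{(2\pi)^{d/2}} \, \bigl\| \partial_x^a x^\alpha f \bigr\|_{L^1(\mathbb{R}^d)}. \]

In particular, this implies $\mathcal{F} f \in \mathcal{S}(\mathbb{R}^d)$. Since $\partial_x^a x^\alpha f \in \mathcal{S}(\mathbb{R}^d)$, we can apply Lemma 7.1.3
and estimate the right-hand side by a finite number of seminorms of $f$. Hence, $\mathcal{F}$ is
continuous: if $f_n$ is a Cauchy sequence in $\mathcal{S}(\mathbb{R}^d)$ that converges to $f$, then $\mathcal{F} f_n$ has to
converge to $\mathcal{F} f \in \mathcal{S}(\mathbb{R}^d)$.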

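To show that $\mathcal{F}$ is a bijection with continuous inverse, we note that it suffices to prove
$\mathcal{F}^{-1} \mathcal{F} f = f$ for functions $f$ in a dense subset, namely $C_c^\infty(\mathbb{R}^d)$ (see Lemma 7.1.4). Pick $f$
so that the support of $f$ is contained in a cube $W_n = [-n, +n]^d$ with sides of length $2n$. We
can write $f$ on $W_n$ as a uniformly convergent Fourier series,

\[ f(x) = \sum_{\xi \in \frac{\pi}{n} \mathbb{Z}^d} \hat{f}_n(\xi) \, e^{+i \xi \cdot x}, \]

with

\[ \hat{f}_n(\xi) = \frac{1}{\mathrm{Vol}(W_n)} \int_{W_n} dx \, e^{-i \xi \cdot x} f(x) = \frac{(2\pi)^{d/2}}{(2n)^d} \, \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} dx \, e^{-i \xi \cdot x} f(x). \]

The second equality holds if $n$ is large enough so that $\mathrm{supp} \, f$ fits into the cube $[-n, +n]^d$.
Hence, $f$ can be expressed as

\[ f(x) = \sum_{\xi \in \frac{\pi}{n} \mathbb{Z}^d} \frac{1}{(2\pi)^{d/2}} \, \frac{\pi^d}{n^d} \, e^{+i \xi \cdot x} \, (\mathcal{F} f)(\xi), \]

which is a Riemann sum that converges to

\[ f(x) = \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} d\xi \, e^{+i \xi \cdot x} \, (\mathcal{F} f)(\xi) = \bigl( \mathcal{F}^{-1} \mathcal{F} f \bigr)(x) \]

as $\mathcal{F} f \in \mathcal{S}(\mathbb{R}^d)$. This concludes the proof. □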


Hence, we have shown that $\mathcal{S}(\mathbb{R}^d)$ has the defining properties that motivated its definition
in the first place. The Schwartz functions also have other nice properties whose
proofs are left as an exercise.


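Proposition 7.1.6 The Schwartz functions have the following properties:

(i) With pointwise multiplication $\cdot : \mathcal{S}(\mathbb{R}^d) \times \mathcal{S}(\mathbb{R}^d) \to \mathcal{S}(\mathbb{R}^d)$, the space of Schwartz functions forms
a Fréchet algebra (i. e. the multiplication is continuous in both arguments).

(ii) For all $a, \alpha$, the map $f \mapsto x^a \partial_x^\alpha f$ is continuous on $\mathcal{S}(\mathbb{R}^d)$.

(iii) For any $x_0 \in \mathbb{R}^d$, the map $f \mapsto f(\cdot - x_0)$ is continuous on $\mathcal{S}(\mathbb{R}^d)$.

(iv) For any $f \in \mathcal{S}(\mathbb{R}^d)$, $\frac{1}{h} \bigl( f(\cdot - h e_k) - f \bigr)$ converges to $-\partial_{x_k} f$ as $h \to 0$ where $e_k$ is the $k$th
canonical base vector of $\mathbb{R}^d$.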

The next important fact will be mentioned without proof: 

Theorem 7.1.7 $\mathcal{S}(\mathbb{R}^d)$ is dense in $L^p(\mathbb{R}^d)$, $1 \le p < \infty$.

This means that we can approximate any $L^p(\mathbb{R}^d)$ function arbitrarily well by test functions. We will use this
and the next theorem to extend the Fourier transform to $L^2(\mathbb{R}^d)$.

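Theorem 7.1.8 For all $f, g \in \mathcal{S}(\mathbb{R}^d)$, we have

\[ \int_{\mathbb{R}^d} dx \, (\mathcal{F} f)(x) \, g(x) = \int_{\mathbb{R}^d} dx \, f(x) \, (\mathcal{F} g)(x). \]

This implies $\langle \mathcal{F} f, g \rangle = \langle f, \mathcal{F}^{-1} g \rangle$ and $\langle \mathcal{F} f, \mathcal{F} g \rangle = \langle f, g \rangle$ where $\langle \cdot, \cdot \rangle$ is the usual scalar
product on $L^2(\mathbb{R}^d)$.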


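Proof Using Fubini's theorem, we conclude that we can interchange the order of the integrations
with respect to $x$ and $\xi$,

\[ \int_{\mathbb{R}^d} dx \, (\mathcal{F} f)(x) \, g(x) = \int_{\mathbb{R}^d} dx \, \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} d\xi \, e^{-i x \cdot \xi} f(\xi) \, g(x) \]
\[ = \int_{\mathbb{R}^d} d\xi \, f(\xi) \, \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} dx \, e^{-i x \cdot \xi} g(x) = \int_{\mathbb{R}^d} d\xi \, f(\xi) \, (\mathcal{F} g)(\xi). \]

To prove the second part, we remark that compared to the scalar product on $L^2(\mathbb{R}^d)$, we
are missing a complex conjugation of the first function. Furthermore, $(\mathcal{F} f)^* = \mathcal{F}^{-1} f^*$
holds. From this, it follows that $\langle \mathcal{F} f, g \rangle = \langle f, \mathcal{F}^{-1} g \rangle$ and upon replacing $g$ with $\mathcal{F} g$, that
$\langle \mathcal{F} f, \mathcal{F} g \rangle = \langle f, \mathcal{F}^{-1} \mathcal{F} g \rangle = \langle f, g \rangle$. □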


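Consequently, the convolution defines a multiplication on $\mathcal{S}(\mathbb{R}^d)$:

Corollary 7.1.9 $\mathcal{S}(\mathbb{R}^d) * \mathcal{S}(\mathbb{R}^d) \subset \mathcal{S}(\mathbb{R}^d)$

Proof Let $f, g \in \mathcal{S}(\mathbb{R}^d)$. Because Schwartz functions are also integrable, $f * g$ exists in
$L^1(\mathbb{R}^d)$ and satisfies $\mathcal{F}(f * g) = (2\pi)^{d/2} \, \mathcal{F} f \, \mathcal{F} g$ (Proposition 6.2.7). This means we can
rewrite $f * g = (2\pi)^{d/2} \, \mathcal{F}^{-1}\bigl( \mathcal{F} f \, \mathcal{F} g \bigr)$ as the inverse Fourier transform of a product of Schwartz
functions, and thus $f * g \in \mathcal{S}(\mathbb{R}^d)$. □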

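Now we will apply this to the free Schrödinger operator $H = -\frac{1}{2} \Delta_x$. First of all, we
conclude from Theorem 7.1.7 that the domain of $H$,

\[ \mathcal{S}(\mathbb{R}^d) \subset \mathcal{D}(H) = \bigl\{ \varphi \in L^2(\mathbb{R}^d) \; \big| \; -\Delta_x \varphi \in L^2(\mathbb{R}^d) \bigr\} \subset L^2(\mathbb{R}^d), \]

is dense. Since derivatives of Schwartz functions are Schwartz functions, $H$ maps $\mathcal{S}(\mathbb{R}^d)$
to itself, and we deduce that the solution

\[ \psi(t) = U(t) \psi_0 = \mathcal{F}^{-1} e^{-i t \frac{1}{2} \hat{\xi}^2} \mathcal{F} \psi_0 \]

to initial conditions $\psi_0 \in \mathcal{S}(\mathbb{R}^d) \subset L^2(\mathbb{R}^d)$ remains a Schwartz function: $\mathcal{F}^{\pm 1}$ leaves $\mathcal{S}(\mathbb{R}^d)$
invariant (Theorem 7.1.5) as does multiplication by $e^{-i t \frac{1}{2} \xi^2}$, because derivatives of that
function are of the form polynomial times $e^{-i t \frac{1}{2} \xi^2}$.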

For these initial conditions, we can also rigorously prove equation (6.2.8) : 

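Proposition 7.1.10 Let $\psi_0 \in \mathcal{S}(\mathbb{R}^d) \subset L^2(\mathbb{R}^d)$. Then for $t \neq 0$ the global solution of the
free Schrödinger equation with initial condition $\psi_0$ is given by

\[ \psi(t,x) = \frac{1}{(2 \pi i t)^{d/2}} \int_{\mathbb{R}^d} dy \, e^{i \frac{(x-y)^2}{2t}} \, \psi_0(y) =: \int_{\mathbb{R}^d} dy \, p(t, x-y) \, \psi_0(y). \tag{7.1.3} \]

This expression converges in the $L^2$ norm to $\psi_0$ as $t \to 0$.

Proof We denote the operator $e^{-i t \frac{1}{2} \hat{\xi}^2}$ in position representation by

\[ U(t) := \mathcal{F}^{-1} e^{-i t \frac{1}{2} \hat{\xi}^2} \mathcal{F}. \]

If $t = 0$, then the bijectivity of the Fourier transform on $\mathcal{S}(\mathbb{R}^d)$, Theorem 7.1.5, yields

\[ U(0) = \mathrm{id}_{\mathcal{S}}. \]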

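So let $t \neq 0$. If $\psi_0$ is a Schwartz function, so is $e^{-i t \frac{1}{2} \xi^2} \hat{\psi}_0$. As the Fourier transform is a
unitary map on $L^2(\mathbb{R}^d)$ (Proposition 6.1.17) and maps Schwartz functions onto Schwartz
functions (Theorem 7.1.5), $U(t)$ also maps Schwartz functions onto Schwartz functions.
Plugging in the definition of the Fourier transform, for any $\psi_0 \in \mathcal{S}(\mathbb{R}^d)$ and $t \neq 0$ we can
write out $U(t) \psi_0$ as

\[ \bigl( \mathcal{F}^{-1} e^{-i t \frac{1}{2} \hat{\xi}^2} \mathcal{F} \psi_0 \bigr)(x) = \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} d\xi \, e^{+i x \cdot \xi} \, e^{-i t \frac{1}{2} \xi^2} \, \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} dy \, e^{-i y \cdot \xi} \, \psi_0(y) \]
\[ = \frac{1}{(2\pi)^{d}} \int_{\mathbb{R}^d} d\xi \int_{\mathbb{R}^d} dy \, e^{i \frac{(x-y)^2}{2t}} \, e^{-i \frac{t}{2} \left( \xi - \frac{x-y}{t} \right)^2} \, \psi_0(y). \tag{7.1.4} \]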


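We need to regularize the integral: if we write the right-hand side of the above as

\[ \text{r. h. s.} = \lim_{\epsilon \searrow 0} \frac{1}{(2\pi)^{d}} \int_{\mathbb{R}^d} d\xi \int_{\mathbb{R}^d} dy \, e^{i \frac{(x-y)^2}{2t}} \, e^{-(\epsilon + i) \frac{t}{2} \left( \xi - \frac{x-y}{t} \right)^2} \, \psi_0(y) \]
\[ = \lim_{\epsilon \searrow 0} \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} dy \, e^{i \frac{(x-y)^2}{2t}} \left( \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} d\xi \, e^{-(\epsilon + i) \frac{t}{2} \left( \xi - \frac{x-y}{t} \right)^2} \right) \psi_0(y), \]

we can use Fubini to change the order of integration. The inner integral can be computed
by interpreting it as an integral in the complex plane,

\[ \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} d\xi \, e^{-(\epsilon + i) \frac{t}{2} \left( \xi - \frac{x-y}{t} \right)^2} = \frac{1}{\bigl( (\epsilon + i) t \bigr)^{d/2}}. \]

Plugged back into equation (7.1.4) and combined with the Dominated Convergence Theorem,
this yields equation (7.1.3). □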
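
Formula (7.1.3) can be tested against a case where the solution is known in closed form. The sketch below works in $d = 1$ with $t > 0$ and the Gaußian initial condition $\psi_0(y) = e^{-y^2/2}$, for which solving (6.2.6) in Fourier representation gives $\psi(t,x) = (1 + it)^{-1/2} \, e^{-x^2 / (2(1+it))}$. The quadrature parameters and the use of the principal branch of the square root are assumptions of this illustration.

```python
import cmath
import math

def psi0(y):
    """Gaussian initial condition psi_0(y) = exp(-y^2 / 2)."""
    return math.exp(-y * y / 2.0)

def propagate(initial, t, x, half_width=12.0, steps=24000):
    """Riemann-sum approximation of the integral in (7.1.3) for d = 1 and t > 0."""
    h = 2.0 * half_width / steps
    total = 0.0 + 0.0j
    for k in range(steps + 1):
        y = -half_width + k * h
        total += cmath.exp(1j * (x - y) ** 2 / (2.0 * t)) * initial(y)
    # Prefactor (2 pi i t)^(-1/2), principal branch of the square root.
    return total * h / cmath.sqrt(2j * math.pi * t)

def exact(t, x):
    """Closed-form solution for the Gaussian initial condition above."""
    return cmath.exp(-x * x / (2.0 * (1.0 + 1j * t))) / cmath.sqrt(1.0 + 1j * t)

t, x = 1.0, 0.5
numeric = propagate(psi0, t, x)
print(numeric, exact(t, x))
```

Although the kernel $p(t, x-y)$ itself does not decay, the integral converges here without regularization because the Gaußian initial condition supplies the decay in $y$.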


7.2 Tempered distributions 

Tempered distributions are linear functionals on $\mathcal{S}(\mathbb{R}^d)$.

Definition 7.2.1 (Tempered distributions) The tempered distributions $\mathcal{S}'(\mathbb{R}^d)$ are the continuous
linear functionals on the Schwartz functions $\mathcal{S}(\mathbb{R}^d)$. If $L \in \mathcal{S}'(\mathbb{R}^d)$ is a linear functional,
we will often write

\[ (L, f) := L(f) \qquad \forall f \in \mathcal{S}(\mathbb{R}^d). \]




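Example (i) The $\delta$ distribution defined via

\[ \delta(f) := f(0) \]

is a linear continuous functional on $\mathcal{S}(\mathbb{R}^d)$. (See exercise sheet 12.)

(ii) Let $g \in L^p(\mathbb{R}^d)$, $1 \le p < \infty$, then for $f \in \mathcal{S}(\mathbb{R}^d)$, we define

\[ L_g(f) = \int_{\mathbb{R}^d} dx \, g(x) \, f(x) =: (g, f). \tag{7.2.1} \]

As $f \in \mathcal{S}(\mathbb{R}^d) \subset L^q(\mathbb{R}^d)$, $\frac{1}{p} + \frac{1}{q} = 1$, by Hölder's inequality, we have

\[ |(g, f)| \le \|g\|_p \, \|f\|_q. \]

Since $\|f\|_q$ can be bounded by a finite linear combination of Fréchet seminorms of
$f$, $L_g$ is continuous, and the inclusion map $\imath : L^p(\mathbb{R}^d) \to \mathcal{S}'(\mathbb{R}^d)$ is continuous.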

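(iii) Equation (7.2.1) is the canonical way to interpret less nice functions as distributions:
we identify a suitable function $g : \mathbb{R}^d \to \mathbb{C}$ with the distribution $L_g$. For instance,
polynomially bounded smooth functions (think of $g(x) = x^2$) define continuous
linear functionals in this manner since for any polynomially bounded $g \in C^\infty(\mathbb{R}^d)$, there exists $n \in \mathbb{N}_0$ such
that $\sqrt{1 + x^2}^{\,-n} g(x)$ is bounded. Hence, for any $f \in \mathcal{S}(\mathbb{R}^d)$, Hölder's inequality
yields

\[ |(g, f)| = \Bigl| \int_{\mathbb{R}^d} dx \, g(x) \, f(x) \Bigr| = \Bigl| \int_{\mathbb{R}^d} dx \, \sqrt{1 + x^2}^{\,-n} g(x) \, \sqrt{1 + x^2}^{\,n} f(x) \Bigr| \]
\[ \le \bigl\| \sqrt{1 + x^2}^{\,-n} g(x) \bigr\|_{L^\infty} \, \bigl\| \sqrt{1 + x^2}^{\,n} f(x) \bigr\|_{L^1}. \]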


Later on, we will see that this point of view, interpreting “not so nice” functions as 
distributions, helps us extend operations from test functions to much broader classes 
of functions. 


Similar to the case of normed spaces, we see that continuity implies “boundedness”. 

Proposition 7.2.2 A linear functional $L : \mathcal{S}(\mathbb{R}^d) \to \mathbb{C}$ is a tempered distribution (i. e. continuous)
if and only if there exist constants $C > 0$ and $k, n \in \mathbb{N}_0$ such that

\[ |L(f)| \le C \sum_{\substack{|a| \le k \\ |\alpha| \le n}} \|f\|_{a\alpha} \]

for all $f \in \mathcal{S}(\mathbb{R}^d)$.


Even though we will not give a proof, let us at least sketch its idea: because one has no
control over the growth or decay of the seminorms $\|f\|_{a\alpha}$, maxima or sums of seminorms
are finite if and only if only finitely many of them enter.

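As mentioned before, we can interpret suitable functions $g$ as tempered distributions. In
particular, every Schwartz function $g \in \mathcal{S}(\mathbb{R}^d)$ defines a tempered distribution so that

\[ (\partial_{x_k} g, f) = \int_{\mathbb{R}^d} dx \, \partial_{x_k} g(x) \, f(x) = -\int_{\mathbb{R}^d} dx \, g(x) \, \partial_{x_k} f(x) = (g, -\partial_{x_k} f) \]

holds for any $f \in \mathcal{S}(\mathbb{R}^d)$. We can use the right-hand side to define derivatives of distributions:

Definition 7.2.3 (Weak derivative) For $\alpha \in \mathbb{N}_0^d$ and $L \in \mathcal{S}'(\mathbb{R}^d)$, we define the weak or
distributional derivative of $L$ as

\[ (\partial_x^\alpha L, f) := \bigl( L, (-1)^{|\alpha|} \partial_x^\alpha f \bigr) \qquad \forall f \in \mathcal{S}(\mathbb{R}^d). \]

Example (i) The weak derivative of $\delta$ is

\[ (\partial_{x_k} \delta, f) = (\delta, -\partial_{x_k} f) = -\partial_{x_k} f(0). \]

(ii) Let $g \in \mathcal{S}(\mathbb{R}^d)$. Then the weak derivative coincides with the usual derivative: by
partial integration, we get

\[ (\partial_{x_k} g, f) = -\int_{\mathbb{R}^d} dx \, g(x) \, \partial_{x_k} f(x) = +\int_{\mathbb{R}^d} dx \, \partial_{x_k} g(x) \, f(x). \]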

Similarly, the Fourier transform can be extended to a bijection $\mathcal{F} : \mathcal{S}'(\mathbb{R}^d) \to \mathcal{S}'(\mathbb{R}^d)$.
Theorem 7.1.8 tells us that if $g, f \in \mathcal{S}(\mathbb{R}^d)$, then

\[ (\mathcal{F} g, f) = (g, \mathcal{F} f) \]

holds. If we replace $g$ with an arbitrary tempered distribution, the right-hand side again
serves as definition of the left-hand side:

Definition 7.2.4 (Fourier transform on $\mathcal{S}'(\mathbb{R}^d)$) For any tempered distribution $L \in \mathcal{S}'(\mathbb{R}^d)$,
we define its Fourier transform to be

\[ (\mathcal{F} L, f) := (L, \mathcal{F} f) \qquad \forall f \in \mathcal{S}(\mathbb{R}^d). \]


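Example (i) The Fourier transform of $\delta$ is the constant function $(2\pi)^{-d/2}$:

\[ (\mathcal{F} \delta, f) = (\delta, \mathcal{F} f) = \mathcal{F} f(0) = \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} dx \, f(x) = \bigl( (2\pi)^{-d/2}, f \bigr). \]

(ii) The Fourier transform of $x^2$ makes sense as a tempered distribution on $\mathbb{R}$: $x^2$ is a
polynomially bounded function and thus defines a tempered distribution via equation (7.2.1):

\[ (\mathcal{F} x^2, f) = (x^2, \mathcal{F} f) = \int_{\mathbb{R}} dx \, x^2 \, \frac{1}{(2\pi)^{1/2}} \int_{\mathbb{R}} d\xi \, e^{-i x \cdot \xi} f(\xi) \]
\[ = \frac{1}{(2\pi)^{1/2}} \int_{\mathbb{R}} dx \int_{\mathbb{R}} d\xi \, \bigl( (+i)^2 \partial_\xi^2 e^{-i x \cdot \xi} \bigr) f(\xi) \]
\[ = (-1)^2 \cdot (-1) \int_{\mathbb{R}} d\xi \, \Bigl( \frac{1}{(2\pi)^{1/2}} \int_{\mathbb{R}} dx \, e^{-i x \cdot \xi} \Bigr) \partial_\xi^2 f(\xi) \]
\[ = -\int_{\mathbb{R}} d\xi \, (2\pi)^{1/2} \, \delta(\xi) \, \partial_\xi^2 f(\xi) = -(2\pi)^{1/2} \, \partial_\xi^2 f(0) = \bigl( (2\pi)^{1/2} \delta, -\partial_\xi^2 f \bigr) = \bigl( -(2\pi)^{1/2} \delta'', f \bigr). \]

This is consistent with what we have shown earlier in Theorem 7.1.5, namely
$\mathcal{F}(x^2 f) = (+i\partial_\xi)^2 \mathcal{F} f = -\partial_\xi^2 \mathcal{F} f$.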

We have just computed Fourier transforms of functions that do not have Fourier transforms
in the usual sense. We can apply the idea we have used to define the derivative and
Fourier transform on $\mathcal{S}'(\mathbb{R}^d)$ to other operators initially defined on $\mathcal{S}(\mathbb{R}^d)$. Before we do
that though, we need to introduce the appropriate notion of continuity on $\mathcal{S}'(\mathbb{R}^d)$.


Definition 7.2.5 (Weak-* convergence) Let $\mathcal{S}$ be a metric space with dual $\mathcal{S}'$. A sequence
$(L_n)$ in $\mathcal{S}'$ is said to converge to $L \in \mathcal{S}'$ in the weak-* sense if

\[ L_n(f) \xrightarrow{\; n \to \infty \;} L(f) \]

holds for all $f \in \mathcal{S}$. We will write $\text{w}^*\text{-}\lim_{n \to \infty} L_n = L$.

This notion of convergence implies a notion of continuity and is crucial for the next theorem.

Theorem 7.2.6 Let $A : \mathcal{S}(\mathbb{R}^d) \to \mathcal{S}(\mathbb{R}^d)$ be a linear continuous map. Then the map
$A' : \mathcal{S}'(\mathbb{R}^d) \to \mathcal{S}'(\mathbb{R}^d)$, defined for all $L \in \mathcal{S}'(\mathbb{R}^d)$ via

\[ (A' L, f) := (L, A f), \qquad f \in \mathcal{S}(\mathbb{R}^d), \tag{7.2.2} \]

is a weak-* continuous linear map.

Put in the terms of Chapter 5.2, $A'$ is the adjoint of $A$.

Proof First of all, $A'$ is linear and well-defined: $A' L = L \circ A$ maps $f \in \mathcal{S}(\mathbb{R}^d)$ to a complex number
and is continuous as the composition of two continuous maps. To show
continuity of $A'$, let $(L_n)$ be a sequence of tempered distributions which converges in the weak-*
sense to $L \in \mathcal{S}'(\mathbb{R}^d)$. Then

\[ (A' L_n, f) = (L_n, A f) \xrightarrow{\; n \to \infty \;} (L, A f) = (A' L, f) \]

holds for all $f \in \mathcal{S}(\mathbb{R}^d)$ and $A'$ is weak-* continuous. □

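As a last consequence, we can extend the convolution from $* : \mathcal{S}(\mathbb{R}^d) \times \mathcal{S}(\mathbb{R}^d) \to \mathcal{S}(\mathbb{R}^d)$
to

\[ * : \mathcal{S}'(\mathbb{R}^d) \times \mathcal{S}(\mathbb{R}^d) \to \mathcal{S}'(\mathbb{R}^d), \]
\[ * : \mathcal{S}(\mathbb{R}^d) \times \mathcal{S}'(\mathbb{R}^d) \to \mathcal{S}'(\mathbb{R}^d). \]

For any $f, g, h \in \mathcal{S}(\mathbb{R}^d)$, we can push the convolution from one argument of the duality
bracket to the other,

\[ (f * g, h) = (g * f, h) = \int_{\mathbb{R}^d} dy \, (f * g)(y) \, h(y) = \int_{\mathbb{R}^d} dy \int_{\mathbb{R}^d} dx \, f(x) \, g(y - x) \, h(y) \]
\[ = \int_{\mathbb{R}^d} dx \, f(x) \, \bigl( g(-\,\cdot\,) * h \bigr)(x) = \bigl( f, g(-\,\cdot\,) * h \bigr). \]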


Thus, we define 

Definition 7.2.7 (Convolution on $\mathcal{S}'(\mathbb{R}^d)$) Let $L \in \mathcal{S}'(\mathbb{R}^d)$ and $f \in \mathcal{S}(\mathbb{R}^d)$. Then the convolution
of $L$ and $f$ is defined as

\[ (L * f, g) := \bigl( L, f(-\,\cdot\,) * g \bigr) \qquad \forall g \in \mathcal{S}(\mathbb{R}^d). \tag{7.2.3} \]

By Theorem 7.2.6, this extension of the convolution is weak-* continuous. Moreover,
the convolution has a neutral element in $\mathcal{S}'(\mathbb{R}^d)$, the delta distribution $\delta = \delta_0$: for all
$f, g \in \mathcal{S}(\mathbb{R}^d)$

\[ (\delta * f, g) = \bigl( \delta, f(-\,\cdot\,) * g \bigr) = \bigl( f(-\,\cdot\,) * g \bigr)(0) = \int_{\mathbb{R}^d} dy \, f\bigl( -(0 - y) \bigr) \, g(y) = (f, g) \]

holds, and thus we can succinctly write

\[ \delta * f = f. \tag{7.2.4} \]


In view of this, we can better understand what Dirac sequences are (cf. Definitions 6.1.8
and 6.2.8): since integrable functions define tempered distributions (cf. equation (7.2.1)),
any Dirac sequence $(\delta_\epsilon)$ can be seen as a sequence of tempered distributions. Moreover,
since the inclusion $\imath : L^1(\mathbb{R}^d) \to \mathcal{S}'(\mathbb{R}^d)$ is continuous and $\delta_\epsilon * f$ converges to $f$
in $L^1(\mathbb{R}^d)$, this sequence also converges in the distributional sense. Hence, $\delta_\epsilon \to \delta$ holds
in the distributional sense as $\epsilon \to 0$.
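
The distributional convergence of a Dirac sequence can be observed numerically. In the following sketch, the Dirac sequence is assumed to consist of normalized Gaußians of width $\epsilon$, and the test function $f(x) = (1 + x) e^{-x^2}$ is an arbitrary illustrative choice with $f(0) = 1$; the pairing $(\delta_\epsilon, f)$ is approximated by a Riemann sum over a window that scales with $\epsilon$.

```python
import math

def delta_eps(eps, x):
    """Gaussian Dirac sequence: normalized Gaussian with standard deviation eps."""
    return math.exp(-x * x / (2.0 * eps * eps)) / (math.sqrt(2.0 * math.pi) * eps)

def pairing(eps, f, steps=4000):
    """Riemann-sum approximation of (delta_eps, f) = int dx delta_eps(x) f(x)."""
    half_width = 10.0 * eps            # the Gaussian is negligible outside this window
    h = 2.0 * half_width / steps
    return h * sum(delta_eps(eps, -half_width + k * h) * f(-half_width + k * h)
                   for k in range(steps + 1))

def f(x):
    """A Schwartz test function with f(0) = 1."""
    return (1.0 + x) * math.exp(-x * x)

# Deviation of (delta_eps, f) from (delta, f) = f(0) for decreasing eps.
errors = [abs(pairing(eps, f) - f(0.0)) for eps in (1.0, 0.1, 0.01)]
print(errors)
```

The deviations shrink roughly like $\epsilon^2$, in line with $(\delta_\epsilon, f) \to f(0) = (\delta, f)$.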

7.3 Partial differential equations on $\mathcal{S}'(\mathbb{R}^d)$

We have extended the most common operations, taking Fourier transform, derivatives
and the convolution, from Schwartz functions to tempered distributions. Hence, we have
managed to ascribe meaning to the partial differential equation

\[ L U := \sum_{|\alpha| \le N} c(\alpha) \, \partial_x^\alpha U = F \]

even if $U, F \in \mathcal{S}'(\mathbb{R}^d)$ are tempered distributions, and we can ask whether $L U = F$ has a
solution. More precisely, the above equation means that

\[ (L U, \varphi) = \sum_{|\alpha| \le N} c(\alpha) \, \bigl( U, (-1)^{|\alpha|} \partial_x^\alpha \varphi \bigr) = (F, \varphi) \]

holds for all test functions $\varphi \in \mathcal{S}(\mathbb{R}^d)$. This point of view is commonly used when considering
differential equations and applying them to functions for which derivatives in the
ordinary sense do not exist.

To give an explicit example, let us reconsider the heat equation 


\[ \partial_t u(t) = D \Delta_x u(t), \qquad u(0) = u_0. \]

In the context of $L^1(\mathbb{R}^d)$, the unique solution $u(t) = G(t) * u_0$ to the initial value problem
(Theorem 6.2.15) involves the fundamental solution

\[ G(t,x) := \frac{1}{(4 \pi D t)^{d/2}} \, e^{-\frac{x^2}{4 D t}}. \]


Seeing as $G(t)$ is a Gaußian for $t > 0$, it is also an element of $\mathcal{S}(\mathbb{R}^d)$, and thus convolving
it with a bona fide tempered distribution makes sense. It stands to reason that

\[ U(t) := G(t) * U_0 \]

solves the heat equation for the initial condition $U(0) = U_0 \in \mathcal{S}'(\mathbb{R}^d)$: first of all, $U(t)$
satisfies the initial condition $U(0) = U_0$, because for all $\varphi \in \mathcal{S}(\mathbb{R}^d)$

\[ \lim_{t \searrow 0} \bigl( U(t), \varphi \bigr) = \lim_{t \searrow 0} \bigl( G(t) * U_0, \varphi \bigr) \]
\[ = \lim_{t \searrow 0} \bigl( U_0, G(t) * \varphi \bigr) \]

holds. Going from the first to the second line involves the definition of the convolution
on $\mathcal{S}'(\mathbb{R}^d)$, equation (7.2.3), as well as $G(t, -x) = G(t, x)$. Given that $G(t)$ is a Dirac
sequence, the limit $\lim_{t \searrow 0} G(t) * \varphi = \varphi$ exists in $L^1(\mathbb{R}^d)$; more specifically, the limit
converges to the Schwartz function $\varphi$, and we deduce

\[ \lim_{t \searrow 0} \bigl( U(t), \varphi \bigr) = (U_0, \varphi). \]

Moreover, $U(t)$ solves the heat equation on $\mathcal{S}'(\mathbb{R}^d)$,

\[ \bigl( \partial_t U(t), \varphi \bigr) = \bigl( \partial_t G(t) * U_0, \varphi \bigr) = \bigl( D \Delta_x G(t) * U_0, \varphi \bigr) = \bigl( D \Delta_x U(t), \varphi \bigr). \]

Hence, $U(t)$ is a solution to the heat equation with initial condition $U_0$. Note that just like
in the case of integrable functions, showing uniqueness involves additional conditions on
$\partial_t U(t)$ and the initial condition $U_0$.


7.4 Other common spaces of distributions 

The ideas outlined in the last two sections can be applied to other spaces of test functions: 
one starts with a space of “nice” functions and the distributions are then comprised of the 
linear continuous functionals on that space. Operations such as derivatives are extended 
to distributions via the adjoint operator. 

Instead of Schwartz functions, often $C_c^\infty(\mathbb{R}^d)$ is used. However, working with this space
is a little bit more unwieldy as it is not stable under the Fourier transform - which also
implies that $\mathcal{F}$ does not extend to a map $\mathcal{F} : C_c^\infty(\mathbb{R}^d)' \to C_c^\infty(\mathbb{R}^d)'$. Moreover, one often
works on bounded domains, i. e. sufficiently regular bounded subsets $\Omega \subset \mathbb{R}^d$. Here, the
distributions are the dual of $C_c^\infty(\Omega)$. Smoothness is also optional, for instance, the Dirac
distribution is defined also on bounded, continuous functions, $C_b(\mathbb{R}^d)$, as $\delta_{x_0}(f) = f(x_0)$.
Chapter 8

Green’s functions 


The basis of this chapter is equation (7.2.4): assume we are interested in solutions of the
inhomogeneous equation

\[ L u = f \tag{8.0.1} \]

where we may take the differential operator $L$ to be of the form

\[ L = \sum_{|\alpha| \le N} c(\alpha) \, \partial_x^\alpha \]

for instance. In addition, we may impose boundary conditions if this equation is defined
on a subset of $\mathbb{R}^d$ with boundary. Formally, the solution to (8.0.1) can be written as
$u = L^{-1} f$ in case $L$ is invertible, but clearly, this is not very helpful as is. A more fruitful
approach starts with the observation that also in the distributional sense

\[ \partial_x^\alpha (G * f) = (\partial_x^\alpha G) * f = G * (\partial_x^\alpha f) \]

holds true for any $G \in \mathcal{S}'(\mathbb{R}^d)$ and $f \in \mathcal{S}(\mathbb{R}^d)$. Hence, if we can write the solution
$u = G * f$ as the convolution of the inhomogeneity $f$ with some tempered distribution $G$,
then $G$ necessarily satisfies

\[ (L u)(x) = L (G * f)(x) = (L G * f)(x) = f(x) = (\delta_x, f), \]

and we immediately obtain an equation for $G$, the Green's function or fundamental solution,
that is independent of $f$:

\[ L G(x) = \delta(x) \tag{8.0.2} \]


Once we solve this equation for $G$, we obtain a solution of (8.0.1) by setting

\[ u(x) = (G * f)(x) = \int_{\mathbb{R}^d} dy \, G(x - y) \, f(y). \]

The partial differential operator $L$ as given above has translational symmetry. In a more
general setting, where translational symmetry is absent, the Green's function depends on
two variables $G(x,y)$ and the solution here is related to the inhomogeneity via

\[ u(x) = \int_{\mathbb{R}^d} dy \, G(x,y) \, f(y). \]


The purpose of this chapter is to compute Green’s functions for specific cases and explore 
some of the caveats. Even though a priori it is not clear that Green’s functions are actually 
defined in terms of a function (as opposed to a bona fide distribution), in many cases it 
turns out they are. 


8.1 Matrix equations as a prototypical example

Another way to understand Green's functions is to appeal to the theory of matrices: assume
$A \in \mathrm{Mat}_{\mathbb{C}}(n)$ is invertible and we are looking for solutions $x \in \mathbb{C}^n$ of the equation

\[ A x = y \]

for some fixed $y \in \mathbb{C}^n$. We can expand $y = \sum_{j=1}^n y_j e_j$ in terms of the canonical basis
vectors $e_j = (0, \ldots, 0, 1, 0, \ldots)$, and if we solve

\[ A g_j = e_j \]

for all $j = 1, \ldots, n$, then we obtain $x = \sum_{j=1}^n y_j g_j$ as the solution of $A x = y$. Moreover,
the matrix $G := (g_1 | \cdots | g_n)$ whose columns are comprised of the vectors $g_j$ satisfies

\[ x = G y. \]

Put another way, here $G$ is just the matrix inverse of $A$. This already points to one fundamental
obstacle for the existence of Green's functions, namely it hinges on the invertibility
of $A$.
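
The construction of $G$ from the solutions $g_j$ can be spelled out in a few lines. The sketch below is an illustration with an arbitrarily chosen invertible $3 \times 3$ matrix; the helper `solve` is a plain Gauß-Jordan elimination over exact fractions written for this example, not a reference to any particular library routine.

```python
from fractions import Fraction

def solve(a, rhs):
    """Solve a x = rhs by Gauss-Jordan elimination with exact rational arithmetic."""
    n = len(a)
    m = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(a, rhs)]
    for col in range(n):
        # Find a row with a nonzero pivot (fails with StopIteration if a is singular).
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return [row[n] for row in m]

A = [[2, 1, 0], [1, 3, 1], [0, 1, 2]]
n = len(A)
basis = [[1 if i == j else 0 for i in range(n)] for j in range(n)]

g = [solve(A, e) for e in basis]                       # g[j] solves A g_j = e_j
G = [[g[j][i] for j in range(n)] for i in range(n)]    # the columns of G are the g_j

y = [1, -2, 4]
x = [sum(G[i][j] * y[j] for j in range(n)) for i in range(n)]    # x = G y
Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]   # should reproduce y
print(x, Ax)
```

Once $G$ has been computed, every new right-hand side $y$ only costs a matrix-vector product, which is precisely the appeal of Green's functions for differential equations.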

The story is more complicated if $A$ is not invertible. For instance, the equation $A x = y$
also makes sense in case $A \in \mathrm{Mat}_{\mathbb{C}}(n, m)$ is a rectangular matrix, and the vectors $x \in \mathbb{C}^m$
and $y \in \mathbb{C}^n$ are from vector spaces of different dimension. Here, it is not clear whether a
unique left-inverse $G \in \mathrm{Mat}_{\mathbb{C}}(m, n)$ exists: there may be cases when one can find no such
$G$ ($A x = y$ has no solution) or when there is a family of left-inverses ($A x = y$ does not
have a unique solution).

In the same vein, the Green's function $G(x,y)$ gives the response at $x$ to a unit impulse
at $y$. The solution $u(x) = \int dy \, G(x,y) \, f(y)$ at $x$ can be seen as the “infinite superposition” of
$G(x,y) \, f(y)$ where the unit impulse at $y$ is scaled by the inhomogeneity $f(y)$.


8.2 Simple examples 

To exemplify the general method, we solve (8.0.2) for a particularly simple case, the one¬ 
dimensional Poisson equation 

-dlu=f. ( 8 . 2 . 1 ) 


According to our discussion, we first need to solve

\[ -\partial_x^2 G(x,y) = \delta(x-y) \tag{8.2.2} \]

for $G$. Put another way, $G$ is the second anti-derivative of the Dirac distribution $\delta$ which
can be found “by hand”, namely

\[ G(x,y) = -\rho(x-y) + a x + b \]

where

\[ \rho(x) := \begin{cases} 0 & x \in (-\infty,0) \\ x & x \in [0,+\infty) \end{cases} . \]

It is instructive to verify (8.2.2); clearly, the term $ax + b$ is harmless, because for smooth
functions weak derivatives (meaning derivatives on $\mathcal{S}'(\mathbb{R}^d)$) and ordinary derivatives (of
$C^\infty(\mathbb{R}^d)$ functions) coincide. Thus, $-\partial_x^2 G(x,y) = \partial_x^2 \rho(x-y)$ holds and (8.2.2) follows
from $\partial_x \rho = \theta$ and the fact that the derivative of the Heaviside function

\[ \theta(x) := \begin{cases} 0 & x \in (-\infty,0) \\ 1 & x \in [0,+\infty) \end{cases} \]

is the Dirac distribution. Note that it does not matter how we define $\rho(0)$ and $\theta(0)$, this
is just a modification on a set of measure 0. What matters is the size of the jump

\[ \lim_{x \searrow 0} \theta(x) - \lim_{x \nearrow 0} \theta(x) = 1 , \]

and its size is independent of how we define $\theta(0)$ and $\rho(0)$.


So far we have derived the Green’s function $G$ for the one-dimensional Poisson equation
on all of $\mathbb{R}$. $G$ depends on the two parameters $a, b \in \mathbb{R}$, i. e. $G$ is not unique. Additional
conditions, e. g. boundary conditions, are needed to nail down a unique solution. For
instance, we could restrict (8.2.1) to the interval $[0,1]$ and use Dirichlet boundary
conditions $u(0) = 0 = u(1)$. Note that we need two conditions to fix the values of the two
parameters $a$ and $b$. To determine the values of $a$ and $b$ (which depend on the second
variable $y$), we solve $G(0,y) = b = 0$ and

\[ G(1,y) = -(1-y) + a = 0 \]

for $a$ and $b$, and obtain

\[ G(x,y) = -\rho(x-y) + (1-y) x = \begin{cases} (1-y) x & x \in [0,y] \\ y (1-x) & x \in (y,1] \end{cases} . \]

The solution $u(x) = G(x) * f$ satisfies the boundary conditions: from $G(0,y) = 0$ we
immediately deduce

\[ u(0) = \int_0^1 \mathrm{d}y \, G(0,y) f(y) = 0 \]

and similarly $u(1) = 0$ follows from $G(1,y) = 0$.
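As a quick numerical sanity check of the Dirichlet Green’s function on $[0,1]$, one can integrate $G(x,y) f(y)$ over $y$ and compare with a solution known in closed form. The sketch below assumes the constant source $f = 1$ (an illustrative choice, for which $u(x) = x(1-x)/2$ solves $-u'' = 1$ with $u(0) = 0 = u(1)$) and uses a simple midpoint rule; the function names are hypothetical.

```python
def green_dirichlet(x, y):
    """G(x, y) for -u'' = f on [0, 1] with u(0) = 0 = u(1)."""
    return (1 - y) * x if x <= y else y * (1 - x)

def solve_poisson(f, x, n=1000):
    """Approximate u(x) = int_0^1 G(x, y) f(y) dy with the midpoint rule on n cells."""
    h = 1.0 / n
    return h * sum(
        green_dirichlet(x, (k + 0.5) * h) * f((k + 0.5) * h) for k in range(n)
    )

def f(y):
    """Illustrative constant source term."""
    return 1.0

def u_exact(x):
    """Closed-form solution of -u'' = 1, u(0) = 0 = u(1)."""
    return x * (1 - x) / 2

for x in (0.0, 0.3, 0.5, 1.0):
    print(x, solve_poisson(f, x), u_exact(x))
```

The boundary values come out as zero automatically because $G(0,y) = 0 = G(1,y)$, independently of the choice of $f$.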

Another example is the PDE from homework problem 42,


- 4)i' = f, 

where the Fourier representation of L is the multiplication operator associated to the 
polynomial P(i^) = —— 4. The inverse of this polynomial enters the 
solution 

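\[ u(x) = \frac{1}{2\pi} \bigl( \mathcal{F}^{-1}(1/P) \bigr) * f(x) ; \]

one can then read off the Green’s function as

\[ G(x,y) = \frac{1}{2\pi} \bigl( \mathcal{F}^{-1}(1/P) \bigr)(x-y) . \]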


8.3 Green’s functions on $\mathbb{R}^d$

The second example gives a strategy to compute Green’s functions on $\mathbb{R}^d$: the crucial
ingredient here is the translational symmetry of the differential operator $L$, i. e. $L$ commutes
with the translation operator $(T_y u)(x) := u(x-y)$, $y \in \mathbb{R}^d$. Explicitly, we will discuss the
Poisson equation

\[ -\Delta_x u = f \tag{8.3.1} \]

in dimension 2 and 3 as well as the three-dimensional wave equation 

(8.3.2) 

Since the strategy to solve these equations is identical, we will outline how to solve $Lu = f$
for a differential operator of the form

\[ L = \sum_{|a| \leq N} c(a) \, \partial_x^a . \]

Then in Fourier representation,

\[ \mathcal{F}(Lu) = \mathcal{F} L \mathcal{F}^{-1} \, \mathcal{F} u = P \, \hat{u} = \hat{f} , \]

the differential operator $L$ transforms to multiplication by the polynomial

\[ P(\xi) = \sum_{|a| \leq N} \mathrm{i}^{|a|} \, c(a) \, \xi^a , \]

and $u(x) = G(x) * f$ can be expressed as the convolution of the Green’s function

\[ G(x,y) := (2\pi)^{-d/2} \bigl( \mathcal{F}^{-1}(1/P) \bigr)(x-y) \]

with the inhomogeneity $f$.

In other words, the problem of finding the Green’s function reduces to computing the
inverse Fourier transform of the rational function $1/P$. For instance, one can interpret
$\mathcal{F}^{-1}(1/P)$ as an integral in the complex plane, use the method of partial fractions and employ
Cauchy’s integral formula.
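To see the recipe at work, here is a short worked example. It is not taken from the text: the operator $L = -\partial_x^2 + 1$ on $\mathbb{R}$ is chosen for illustration, and the computation assumes the unitary convention $(\mathcal{F}u)(\xi) = (2\pi)^{-1/2} \int_{\mathbb{R}} \mathrm{d}x \, e^{-\mathrm{i}\xi x} u(x)$, which is the convention consistent with the prefactor $(2\pi)^{-d/2}$ in the formula for $G$.

```latex
\begin{align*}
  P(\xi) &= \xi^2 + 1 , \\
  \bigl( \mathcal{F}^{-1}(1/P) \bigr)(x)
    &= \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} \mathrm{d}\xi \, \frac{e^{+\mathrm{i}\xi x}}{\xi^2 + 1}
     = \frac{1}{\sqrt{2\pi}} \, \pi \, e^{-|x|}
     = \sqrt{\tfrac{\pi}{2}} \, e^{-|x|} , \\
  G(x,y) &= (2\pi)^{-1/2} \, \sqrt{\tfrac{\pi}{2}} \, e^{-|x-y|}
     = \tfrac{1}{2} \, e^{-|x-y|} .
\end{align*}
```

The $\xi$-integral is evaluated by closing the contour around the simple pole at $\xi = +\mathrm{i}$ (for $x > 0$) or $\xi = -\mathrm{i}$ (for $x < 0$). As a cross-check, $G$ solves $-\partial_x^2 G + G = 0$ away from $x = y$, and $\partial_x G$ jumps by $-1$ across $x = y$, which is exactly what $-\partial_x^2 G + G = \delta(x-y)$ requires.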

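For the special case of the Poisson equation $-\Delta_x u = f$, another way to obtain the Green’s
function relies on Green’s formula

\[ \int_V \mathrm{d}x \, \bigl( \Delta_x u(x) \, v(x) - u(x) \, \Delta_x v(x) \bigr) = \int_{\partial V} \mathrm{d}S \cdot \bigl( \partial_n u(x) \, v(x) - u(x) \, \partial_n v(x) \bigr) . \tag{8.3.3} \]

Here, $V$ is a subset of $\mathbb{R}^d$ with boundary $\partial V$ and $u$ or $v$ has compact support.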


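Theorem 8.3.1 (Green’s function for the Poisson equation) The Green’s function for the
Poisson equation is

\[ G(x,y) = \begin{cases} \frac{1}{C_d} \, |x-y|^{2-d} & d \neq 2 \\ -\frac{1}{2\pi} \ln |x-y| & d = 2 \end{cases} \]

where $C_d = (d-2) \, \mathrm{Area}(S^{d-1})$ and $\mathrm{Area}(S^{d-1})$ is the surface area of the $(d-1)$-dimensional
sphere.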


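Proof Let us prove the case $d \neq 2$; the arguments for the two-dimensional case are virtually
identical, one only needs to replace $|x|^{2-d}$ with $\ln|x|$. First of all, $|x|^{2-d}$ and $\ln|x|$
define tempered distributions (use polar coordinates and arguments analogous
to Lemma 7.1.3 to prove continuity). Moreover, away from $x = 0$, $\Delta_x |x|^{2-d} = 0$ and
$\Delta_x \ln|x| = 0$ hold in the ordinary sense.

Now let $\varphi \in \mathcal{S}(\mathbb{R}^d)$ be arbitrary. Since smooth functions with compact support are dense
in $\mathcal{S}(\mathbb{R}^d)$, we may assume $\varphi \in C_c^\infty(\mathbb{R}^d)$. Define $V_\varepsilon := \mathbb{R}^d \setminus B_\varepsilon$ where $B_\varepsilon := \{ x \in \mathbb{R}^d \mid |x| < \varepsilon \}$.
On $V_\varepsilon$ where we have cut out a small hole around the origin, $\Delta_x |x|^{2-d} = 0$ holds and
we can write

\begin{align*}
  \bigl( -\Delta_x |x|^{2-d} , \varphi \bigr) &= - \bigl( |x|^{2-d} , \Delta_x \varphi \bigr)
    = - \lim_{\varepsilon \searrow 0} \int_{V_\varepsilon} \mathrm{d}x \, |x|^{2-d} \, \Delta_x \varphi(x) \\
  &= \lim_{\varepsilon \searrow 0} \int_{V_\varepsilon} \mathrm{d}x \, \Bigl( \Delta_x |x|^{2-d} \, \varphi(x) - |x|^{2-d} \, \Delta_x \varphi(x) \Bigr) .
\end{align*}

Now Green’s formula applies,

\begin{align*}
  \ldots &= \lim_{\varepsilon \searrow 0} \int_{\partial V_\varepsilon} \mathrm{d}S \cdot \Bigl( \partial_n |x|^{2-d} \, \varphi(x) - |x|^{2-d} \, \partial_n \varphi(x) \Bigr) \\
  &= - \lim_{\varepsilon \searrow 0} \int_{\partial B_\varepsilon} \mathrm{d}S \cdot \Bigl( (2-d) \, \varepsilon^{1-d} \, \varphi(x) - \varepsilon^{2-d} \, \partial_r \varphi(x) \Bigr) ,
\end{align*}

and we obtain an integral with respect to $\mathrm{d}S$, the surface measure of the sphere of radius
$\varepsilon$. Note that the minus sign is due to the difference in orientation (the outward normal
on $\partial V_\varepsilon$ points towards the origin while the surface normal of $\partial B_\varepsilon$ points away from it).
Since the surface area of $\partial B_\varepsilon$ scales like $\varepsilon^{d-1}$, the second term vanishes while the first term
converges to

\begin{align*}
  \bigl( -\Delta_x |x|^{2-d} , \varphi \bigr) &= (d-2) \, \mathrm{Area}(S^{d-1}) \, \varphi(0) \\
  &= (d-2) \, \mathrm{Area}(S^{d-1}) \, (\delta , \varphi) .
\end{align*}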
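Two ingredients of the theorem can be checked numerically for $d \geq 3$: the normalization constant $(d-2)\,\mathrm{Area}(S^{d-1})$, which equals $4\pi$ in $d = 3$, and the fact that $|x|^{2-d}$ is harmonic away from the origin. The following sketch assumes the standard formula $\mathrm{Area}(S^{d-1}) = 2\pi^{d/2}/\Gamma(d/2)$ for the unit sphere and approximates the Laplacian by central finite differences; the function names and the test point are illustrative choices.

```python
import math

def sphere_area(d):
    """Surface area of the unit sphere S^{d-1} in R^d: 2 pi^{d/2} / Gamma(d/2)."""
    return 2 * math.pi ** (d / 2) / math.gamma(d / 2)

def free_green(x):
    """|x|^{2-d} / ((d-2) Area(S^{d-1})) for a point x in R^d with d = len(x) >= 3."""
    d = len(x)
    r = math.sqrt(sum(c * c for c in x))
    return r ** (2 - d) / ((d - 2) * sphere_area(d))

def laplacian(func, x, h=1e-3):
    """Central finite-difference approximation of the Laplacian of func at x."""
    total = 0.0
    for j in range(len(x)):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        total += (func(xp) - 2 * func(x) + func(xm)) / h ** 2
    return total

print((3 - 2) * sphere_area(3), 4 * math.pi)       # normalization constant in d = 3
print(laplacian(free_green, [1.0, 0.5, -0.3]))     # approximately 0 away from the origin
```

The finite-difference check says nothing about the behavior at the origin; the delta contribution there is exactly what the surface integral in the proof captures.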


8.4 Green’s functions on domains: implementing boundary 
conditions 

Predictably the presence of boundaries complicates things. To exemplify some of the
hurdles, let us discuss the Poisson equation

\[ -\Delta_x u = f , \qquad u|_{\partial\Omega} = h \in C^1(\partial\Omega) , \tag{8.4.1} \]

on a bounded subset $\Omega$ of $\mathbb{R}^d$ with Dirichlet boundary conditions, i. e. we prescribe the
value of $u$ on the boundary (rather than that of its normal derivative).

Imposing boundary conditions is necessary, because $-\Delta_x$ is not injective: the kernel of
$-\Delta_x$ is made up of harmonic functions, i. e. functions which satisfy

\[ -\Delta_x h = 0 . \]

For instance, the function $h(x) = e^{x_1} \sin x_2$ is harmonic. Consequently, if $u$ is a solution
of $-\Delta_x u = f$, then $u + h$ is another. However, it turns out that fixing $u$ on the boundary $\partial\Omega$
singles out a unique solution.

The first step in this direction is the following representation of a sufficiently regular
function on $\Omega$:


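Proposition 8.4.1 Suppose $\Omega \subset \mathbb{R}^d$ is a bounded domain with piecewise $C^1$ boundary $\partial\Omega$
and pick $u \in C^2(\Omega) \cap C^1(\overline{\Omega})$. Then for any $x \in \Omega$ we can write

\[ u(x) = - \int_\Omega \mathrm{d}y \, G(x,y) \, \Delta_y u(y) + \int_{\partial\Omega} \mathrm{d}S_y \cdot \Bigl( G(x,y) \, \partial_{n_y} u(y) - \partial_{n_y} G(x,y) \, u(y) \Bigr) \]

where the index $y$ in the surface measure $\mathrm{d}S_y$ and the surface normal $n_y$ indicates that they
are associated to the variable $y$.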


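The idea here is to exploit $-\Delta_y G(x,y) = \delta(x-y)$ as well as Green’s formula (8.3.3), and
formally, we immediately obtain

\begin{align*}
  \int_\Omega \mathrm{d}y \, \Bigl( \Delta_y G(x,y) \, u(y) - G(x,y) \, \Delta_y u(y) \Bigr)
    &= - u(x) - \int_\Omega \mathrm{d}y \, G(x,y) \, \Delta_y u(y) \\
    &= \int_{\partial\Omega} \mathrm{d}S_y \cdot \Bigl( \partial_{n_y} G(x,y) \, u(y) - G(x,y) \, \partial_{n_y} u(y) \Bigr) .
\end{align*}

To make this rigorous, one has to cut a small hole around $x$ and adapt the strategy from
the proof of Theorem 8.3.1. However, we will skip the proof.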

Theorem 8.4.2 The Green’s function for the Dirichlet boundary value problem for the
Poisson equation on a bounded domain $\Omega \subset \mathbb{R}^2$ with boundary $\partial\Omega$ has the form

\[ G_\Omega(x,y) = G(x,y) + b(x,y) \]

where $G(x,y) = -\frac{1}{2\pi} \ln|x-y|$ is the Green’s function of the free problem and $b(x,y)$ is the
solution to the boundary value problem

\[ \Delta_y b = 0 , \qquad b(x,y) = -G(x,y) \quad \forall x \in \partial\Omega \text{ or } \forall y \in \partial\Omega . \tag{8.4.2} \]

Then the solution $u$ to the inhomogeneous Dirichlet problem (8.4.1) is given by

\[ u(x) = \int_\Omega \mathrm{d}y \, G_\Omega(x,y) \, f(y) - \int_{\partial\Omega} \mathrm{d}S_y \cdot \partial_{n_y} G_\Omega(x,y) \, h(y) . \tag{8.4.3} \]

A priori it is not clear whether (8.4.2) has a solution; we postpone a more in-depth
discussion of harmonic functions and proceed under the assumption it does.


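Proof Clearly, seeing as $G_\Omega$ is the sum of the free fundamental solution $G$ and a harmonic
function, $-\Delta_y G_\Omega(x,y) = \delta(x-y)$ still holds. Moreover, the Green’s function implements
the boundary conditions, namely

\[ G_\Omega(x,y) = G(x,y) + b(x,y) = 0 \]

is satisfied by construction on the boundary.

Then Green’s identity (8.3.3) and $\Delta_y b = 0$ imply

\[ 0 = - \int_\Omega \mathrm{d}y \, b(x,y) \, \Delta_y u(y) + \int_{\partial\Omega} \mathrm{d}S_y \cdot \Bigl( b(x,y) \, \partial_{n_y} u(y) - \partial_{n_y} b(x,y) \, u(y) \Bigr) . \]

The function $u$ solves the inhomogeneous Poisson equation $-\Delta_y u = f$; on the boundary,
$b(x,y)$ coincides with $-G(x,y)$ and $u(y) = h(y)$ holds, and we deduce

\[ \int_{\partial\Omega} \mathrm{d}S_y \cdot G(x,y) \, \partial_{n_y} u(y) = \int_\Omega \mathrm{d}y \, b(x,y) \, f(y) - \int_{\partial\Omega} \mathrm{d}S_y \cdot \partial_{n_y} b(x,y) \, h(y) . \]

This identity replaces one of the boundary terms in the integral representation of
Proposition 8.4.1, and we recover equation (8.4.3),

\begin{align*}
  u(x) &= - \int_\Omega \mathrm{d}y \, G(x,y) \, \Delta_y u(y) + \int_{\partial\Omega} \mathrm{d}S_y \cdot \Bigl( G(x,y) \, \partial_{n_y} u(y) - \partial_{n_y} G(x,y) \, u(y) \Bigr) \\
  &= \int_\Omega \mathrm{d}y \, G_\Omega(x,y) \, f(y) - \int_{\partial\Omega} \mathrm{d}S_y \cdot \partial_{n_y} G_\Omega(x,y) \, h(y) .
\end{align*}

□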
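The structure of the solution formula (8.4.3) can be illustrated in the simplest setting, the interval $[0,1]$ from Section 8.2, where the “surface integral” reduces to a sum over the two boundary points with outward normals $-1$ at $y = 0$ and $+1$ at $y = 1$. This one-dimensional analogue is an assumption made here for illustration (the theorem itself is stated for domains in higher dimension), and the function names and the data $f = 1$, $h(0) = 2$, $h(1) = 5$ are hypothetical.

```python
def green(x, y):
    """Dirichlet Green's function of -d^2/dx^2 on [0, 1]."""
    return (1 - y) * x if x <= y else y * (1 - x)

def normal_derivative_green(x, y_boundary):
    """Outward normal derivative of y -> G(x, y) at y = 0 or y = 1, for 0 < x < 1."""
    if y_boundary == 0:
        # Near y = 0 < x we have G = y (1 - x); the outward normal is -d/dy.
        return -(1 - x)
    # Near y = 1 > x we have G = (1 - y) x; the outward normal is +d/dy.
    return -x

def solve_dirichlet(f, h0, h1, x, n=1000):
    """u(x) = int G f dy - sum over boundary points of (d_n G) h, cf. (8.4.3)."""
    step = 1.0 / n
    volume = step * sum(
        green(x, (k + 0.5) * step) * f((k + 0.5) * step) for k in range(n)
    )
    boundary = (
        normal_derivative_green(x, 0) * h0 + normal_derivative_green(x, 1) * h1
    )
    return volume - boundary

def source(y):
    """Illustrative constant source term."""
    return 1.0

# For f = 1 the exact solution is x (1 - x) / 2 + (1 - x) h0 + x h1.
u = solve_dirichlet(source, 2.0, 5.0, 0.3)
```

In this setting the boundary term is just the linear interpolation $(1-x)\,h(0) + x\,h(1)$ of the boundary data, i. e. the harmonic function with the prescribed boundary values.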
Chapter 9

Quantum mechanics 


The explanation of the photoelectric effect through light quanta is the namesake for
quantum mechanics. Quantization here refers to the idea that energy stored in light
comes in “chunks” known as photons, and that the energy per photon depends only on the
frequency. This is quite a departure from the classical theory of light through Maxwell’s
equations (cf. Chapter 5.5).

The reader can only get a glimpse of quantum theory in this chapter. A good standard 
physics textbook on the subject is [Sak94] while the mathematics of quantum mechanics 
is covered in more depth in [Tes09; GS11].


9.1 Paradigms 


The simplest bona fide quantum system is that of a quantum spin, and it can be used
to give an effective description of the Stern-Gerlach experiment where a beam of neutral
atoms with magnetic moment $g$ is sent through a magnet with inhomogeneous magnetic
field $B = (B_1, B_2, B_3)$. It was observed experimentally that the beam splits in two rather
than fan out with continuous distribution. Hence, the system behaves as if only two spin
configurations, spin-up $\uparrow$ and spin-down $\downarrow$, are realized. A simplified (effective) model
neglects the translational degree of freedom and focusses only on the internal spin degree
of freedom. Then the energy observable, the hamiltonian, is the matrix

\[ H = g \, B \cdot S \]
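which involves the spin operator $S_j := \frac{\hbar}{2} \sigma_j$ defined in terms of Planck’s constant $\hbar$ and
the three Pauli matrices

\[ \sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} , \qquad \sigma_2 = \begin{pmatrix} 0 & -\mathrm{i} \\ +\mathrm{i} & 0 \end{pmatrix} , \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} , \]

and the magnetic moment $g$ and the magnetic field $B$. The prefactors of the Pauli matrices
are real, and thus $H = H^*$ is a hermitian matrix.

For instance, assume $B = (0,0,b)$ points in the $x_3$-direction. Then spin-up and spin-down
(seen from the $x_3$-direction) are the eigenvectors of

\[ H = \frac{\hbar g b}{2} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} , \]

i. e. $\psi_\uparrow = (1,0)$ and $\psi_\downarrow = (0,1)$. The dynamical equation is the Schrödinger equation

\[ \mathrm{i}\hbar \frac{\mathrm{d}}{\mathrm{d}t} \psi(t) = H \psi(t) , \qquad \psi(0) = \psi_0 \in \mathcal{H} . \tag{9.1.1} \]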

The vector space $\mathcal{H} = \mathbb{C}^2$ becomes a Hilbert space if we equip it with the scalar product

\[ \langle \psi , \varphi \rangle_{\mathbb{C}^2} := \sum_{j=1,2} \overline{\psi_j} \, \varphi_j . \]

Moreover, the hermitian matrix $H$ can always be diagonalized (cf. exercise 35-36), and
the eigenvectors to distinct eigenvalues are orthogonal. The complex-valued wave function
$\psi$ encapsulates probabilities: for any $\psi$ normalized to $1 = \|\psi\|_{\mathbb{C}^2}$, the probability to
find the particle in the spin-up configuration is

\[ \mathbb{P}(S_3 = \uparrow) = |\psi_1|^2 = \bigl| \langle \psi_\uparrow , \psi \rangle \bigr|^2 \]

since $\psi_\uparrow = (1,0)$. The above notation comes from probability theory and means “the
probability of finding the random observable spin $S_3$ in the spin-$\uparrow$ configuration $+\frac{\hbar}{2}$”.
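To make the spin model concrete, one can evolve a spin-up state in a magnetic field that does not point in the $x_3$-direction and watch the spin-up probability oscillate. The sketch below assumes a field $B = b\,n$ with unit vector $n$ and uses the standard identity $e^{-\mathrm{i}\theta\, n\cdot\sigma} = \cos\theta \, \mathrm{id} - \mathrm{i}\sin\theta \; n\cdot\sigma$ with $\theta = g b t/2$, which follows from $(n\cdot\sigma)^2 = \mathrm{id}$; the helper names are hypothetical.

```python
import math

# The three Pauli matrices as nested lists
SIGMA = (
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
)

def precession_angle(g, b, t):
    """theta = g b t / 2, so that exp(-i t H / hbar) = exp(-i theta n.sigma) for H = g b n.S."""
    return g * b * t / 2

def evolve(psi0, n, theta):
    """Apply exp(-i theta n.sigma) = cos(theta) id - i sin(theta) n.sigma to psi0 (n a unit vector)."""
    ns = [[sum(n[k] * SIGMA[k][a][b] for k in range(3)) for b in range(2)]
          for a in range(2)]
    c, s = math.cos(theta), math.sin(theta)
    return [c * psi0[a] - 1j * s * sum(ns[a][b] * psi0[b] for b in range(2))
            for a in range(2)]

def prob_up(psi):
    """Probability |psi_1|^2 of finding spin-up along the x_3-direction."""
    return abs(psi[0]) ** 2

psi_up = [1, 0]
# Field along x_1: the spin-up probability oscillates like cos^2(theta).
psi_t = evolve(psi_up, (1, 0, 0), precession_angle(g=1.0, b=1.0, t=math.pi / 2))
print(prob_up(psi_t))
```

For a field along $x_3$ the state only picks up a phase and the spin-up probability stays equal to 1, in line with $\psi_\uparrow$ being an eigenvector of $H$ in that case.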

The second exemplary quantum system describes a non-relativistic particle of mass $m$
subjected to an electric field generated by the potential $V$. The classical Hamilton function
$h(q,p) = \frac{1}{2m} p^2 + V(q)$ is then “quantized” to

\[ H = h(\hat{x}, -\mathrm{i}\hbar\nabla_x) = \frac{1}{2m} (-\mathrm{i}\hbar\nabla_x)^2 + V(\hat{x}) \]

by replacing momentum $p$ by the momentum operator $P = -\mathrm{i}\hbar\nabla_x$ and position $q$ by the
multiplication operator $Q = \hat{x}$.¹ The hamiltonian is now an operator on the Hilbert space
$L^2(\mathbb{R}^d)$ whose action on suitable vectors $\psi$ is

\[ (H\psi)(x) = - \frac{\hbar^2}{2m} (\Delta_x \psi)(x) + V(x) \, \psi(x) . \]

Figure 9.1.1: Images of a low-intensity triple slit experiment with photons (taken from
[Cro08]).

Quantum particles simultaneously have wave and particle character: the Schrödinger
equation (9.1.1) is structurally very similar to a wave equation. The physical constant $\hbar$
relates the energy of a particle with the associated wave length and has units [energy ·
time]. The particle aspect appears when one measures outcomes of experiments: consider
a version of the Stern-Gerlach experiment where the intensity of the atomic beam is so low
that single atoms pass through the magnet. If the modulus square of the wave function
$|\psi(t,x)|^2$ were to describe the intensity of a matter wave, then one expects that the two
peaks build up slowly, but simultaneously. In actuality, one registers single impacts of
atoms and only if one waits long enough, two peaks emerge (similar to what one sees in
a low-intensity triple slit experiment in Figure 9.1.1). This is akin to tossing a coin: one
cannot see the probabilistic nature in a few coin tosses, let alone a single one. Probabilities
emerge only after repeating the experiment often enough. These experiments show that
$|\psi(t,x)|^2$ is to be interpreted as a probability distribution, but more on that below.

Pure states are described by wave functions, i. e. complex-valued, square integrable
functions. Put more precisely, we are considering $L^2(\mathbb{R}^d)$ made up of equivalence classes
of functions with scalar product

\[ \langle \varphi , \psi \rangle = \int_{\mathbb{R}^d} \mathrm{d}x \, \overline{\varphi(x)} \, \psi(x) \]

and norm $\|\psi\| := \sqrt{\langle \psi , \psi \rangle}$. In physics text books, one usually encounters the bra-ket
notation: here $|\psi\rangle$ is a state and $\langle x | \psi \rangle$ is $\psi(x)$. The scalar product of $\varphi, \psi \in L^2(\mathbb{R}^d)$ is
denoted by $\langle \varphi | \psi \rangle$ and corresponds to $\langle \varphi , \psi \rangle$. Although bra-ket notation can be
ambiguous, it is sometimes useful and in fact used in mathematics every once in a while.

¹ To find a consistent quantization procedure is highly non-trivial. One possibility is to use Weyl quantization
[Wey27; Wig32; Moy49; Fol89; Lei10]. Such a quantization procedure also yields a formulation of a
semiclassical limit, and the names for various operators (e. g. position, momentum and angular momentum)
are then justified via a semiclassical limit. For instance, the momentum operator is $-\mathrm{i}\hbar\nabla_x$, because in the
semiclassical limit it plays the role of the classical momentum observable $p$ (cf. e. g. [Lei10, Theorem 1.0.1]
and [Lei10, Theorem 7.0.1]).

The fact that $L^2(\mathbb{R}^d)$ consists of equivalence classes of functions is only natural from
a physical perspective: if $\psi_1 \sim \psi_2$ are in the same equivalence class (i. e. they differ
on a set of measure 0), then the arguments in Chapter 4.1.2 state that the associated
probabilities coincide: Physically, $|\psi(x,t)|^2$ is interpreted as the probability to measure a
particle at time $t$ in (an infinitesimally small box located in) location $x$. If we are interested
in the probability that we can measure a particle in a region $\Lambda \subseteq \mathbb{R}^d$, we have to integrate
$|\psi(x,t)|^2$ over $\Lambda$,

\[ \mathbb{P}(X(t) \in \Lambda) = \int_\Lambda \mathrm{d}x \, |\psi(x,t)|^2 . \tag{9.1.2} \]


If we want to interpret $|\psi|^2$ as probability density, the wave function has to be normalized,
i. e.

\[ \|\psi\|^2 = \int_{\mathbb{R}^d} \mathrm{d}x \, |\psi(x)|^2 = 1 . \]

This point of view is called Born rule: $|\psi|^2$ could either be a mass or charge density - or
a probability density. To settle this, physicists have performed the double slit experiment
with an electron source of low flux. If $|\psi|^2$ were a density, one would see the whole
interference pattern building up slowly. Instead, one measures “single impacts” of
electrons and the result is similar to the data obtained from experiments in statistics (e. g. the
Galton board). Hence, we speak of particles.
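The probability formula (9.1.2) is easy to evaluate for a concrete wave function. The sketch below assumes a one-dimensional Gaussian wave function $\psi(x) = (\pi\sigma^2)^{-1/4} e^{-x^2/(2\sigma^2)}$ (an illustrative choice, not taken from the text), for which $\int_{-a}^{a} |\psi|^2 \, \mathrm{d}x = \mathrm{erf}(a/\sigma)$, and compares a midpoint-rule approximation with this closed form.

```python
import math

def psi(x, sigma=1.0):
    """Normalized Gaussian wave function on the real line (illustrative choice)."""
    return (math.pi * sigma ** 2) ** -0.25 * math.exp(-x ** 2 / (2 * sigma ** 2))

def probability(a, b, sigma=1.0, n=20000):
    """Midpoint-rule approximation of P(X in [a, b]) = int_a^b |psi(x)|^2 dx."""
    h = (b - a) / n
    return h * sum(abs(psi(a + (k + 0.5) * h, sigma)) ** 2 for k in range(n))

print(probability(-10.0, 10.0))              # total probability, close to 1
print(probability(-1.0, 1.0), math.erf(1.0)) # probability to find the particle in [-1, 1]
```

The first number confirms the normalization $\|\psi\| = 1$ up to discretization error, and the second is the probability of detecting the particle within one width $\sigma$ of the origin.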


9.2 The mathematical framework 

To identify the common structures, let us study quantum mechanics in the abstract. Just 
like in the case of classical mechanics, we have to identify states, observables and dynamical 
equations in Schrodinger and Heisenberg picture. 


9.2.1 Quantum observables 

Quantities that can be measured are represented by selfadjoint (hermitian in physics
parlance) operators $F$ on the Hilbert space $\mathcal{H}$ (typically $L^2(\mathbb{R}^d)$), i. e. special linear maps

\[ F : \mathcal{D}(F) \subseteq \mathcal{H} \longrightarrow \mathcal{H} . \]

Here, $\mathcal{D}(F)$ is the domain of the operator since typical observables are not defined for all
$\psi \in \mathcal{H}$. This is not a mathematical subtlety with no physical content, quite the contrary:
consider the observable energy, typically given by

\[ H = \frac{1}{2m} (-\mathrm{i}\hbar\nabla_x)^2 + V(\hat{x}) , \]

then states in the domain

\[ \mathcal{D}(H) := \bigl\{ \psi \in L^2(\mathbb{R}^d) \mid H\psi \in L^2(\mathbb{R}^d) \bigr\} \subseteq L^2(\mathbb{R}^d) \]

are those of finite energy. For all $\psi$ in the domain of the hamiltonian $\mathcal{D}(H) \subseteq L^2(\mathbb{R}^d)$ the
expectation value

\[ \langle \psi , H\psi \rangle < \infty \]

is finite. Well-defined observables have domains that are dense in $\mathcal{H}$. Similarly, states
in the domain $\mathcal{D}(\hat{x}_l)$ of the $l$th component of the position operator are those that are
“localized in a finite region” in the sense of expectation values. Boundary conditions
may also enter the definition of the domain: as seen in the example of the momentum
operator on $[0,1]$, different boundary conditions yield different momentum operators
(see Chapter 5.3 for details).

The energy observable is just a specific example, but it contains all the ingredients 
which enter the definition of a quantum observable: 

Definition 9.2.1 (Observable) A quantum observable $F$ is a densely defined, selfadjoint
operator on a Hilbert space. The spectrum $\sigma(F)$ (cf. Definition 5.1.7) is the set of outcomes
of measurements.

Physically, results of measurements are real which is reflected in the selfadjointness of
operators (cf. Chapter 5.4), $H^* = H$. (A symmetric operator is selfadjoint if $\mathcal{D}(H^*) = \mathcal{D}(H)$.)
The set of possible outcomes of measurements is the spectrum $\sigma(H) \subseteq \mathbb{R}$ (the spectrum is
defined as the set of complex numbers $z$ so that $H - z$ is not invertible, cf. Chapter 5.1.7).
Spectra of selfadjoint operators are necessarily subsets of the reals (cf. Theorem 9.3.3).
Typically one “guesses” quantum observables from classical observables: in $d = 3$, the
angular momentum operator is given by

\[ L = \hat{x} \wedge (-\mathrm{i}\hbar\nabla_x) . \]


In the simplest case, one uses Dirac’s recipe (replace $x$ by $\hat{x}$ and $p$ by $\hat{p} = -\mathrm{i}\hbar\nabla_x$) on the
classical observable angular momentum $L(x,p) = x \wedge p$. In other words, many quantum
observables are obtained as quantizations of classical observables: examples are position,
momentum and energy. Moreover, the interpretation of, say, the angular momentum
operator as angular momentum is taken from classical mechanics.

In the definition of the domain, we have already used the definition of expectation
value: the expectation value of an observable $F$ with respect to a state $\psi$ (which we
assume to be normalized, $\|\psi\| = 1$) is given by

\[ \mathbb{E}_\psi(F) := \langle \psi , F\psi \rangle . \tag{9.2.1} \]

The expectation value is finite if the state $\psi$ is in the domain $\mathcal{D}(F)$. The Born rule of
quantum mechanics tells us that if we repeat an experiment measuring the observable $F$
many times for a particle that is prepared in the state $\psi$ each time, the statistical average
calculated according to the relative frequencies converges to the expectation value $\mathbb{E}_\psi(F)$.

Hence, quantum observables, selfadjoint operators on Hilbert spaces, are bookkeeping 
devices that have two components: 

(i) a set of possible outcomes of measurements, the spectrum $\sigma(F)$, and

(ii) statistics, i. e. how often a possible outcome occurs. 

9.2.2 Quantum states 

Pure states are wave functions $\psi \in \mathcal{H}$, or rather, wave functions up to a total phase: just
like one can measure only energy differences, only phase shifts are accessible to
measurements. Hence, one can think of pure states as orthogonal projections

\[ P_\psi := |\psi\rangle\langle\psi| = \langle \psi , \, \cdot \, \rangle \, \psi \]

if $\psi$ is normalized to 1, $\|\psi\| = 1$. Here, one can see the elegance of bra-ket notation vs.
the notation that is “mathematically proper”. A generalization of this concept are density
operators $\rho$ (often called density matrices): density matrices are defined via the trace. If
$\rho$ is a suitable linear operator and $\{\varphi_n\}_{n \in \mathbb{N}}$ an orthonormal basis of $\mathcal{H}$, then we define

\[ \mathrm{Tr} \, \rho := \sum_{n \in \mathbb{N}} \langle \varphi_n , \rho \, \varphi_n \rangle . \]

One can easily check that this definition is independent of the choice of basis (see
homework problem 28). Clearly, $P_\psi$ has trace 1 and it is also positive in the sense that

\[ \langle \varphi , P_\psi \varphi \rangle \geq 0 \]

for all $\varphi \in \mathcal{H}$. This is also the good definition for quantum states:
Definition 9.2.2 (Quantum state) A quantum state (or density operator/matrix) $\rho = \rho^*$
is a non-negative operator of trace 1, i. e.

\[ \langle \psi , \rho \psi \rangle \geq 0 \quad \forall \psi \in \mathcal{H} , \qquad \mathrm{Tr} \, \rho = 1 . \]

If $\rho$ is also an orthogonal projection, i. e. $\rho^2 = \rho$, it is a pure state.² Otherwise $\rho$ is a mixed
state.

Density operators are projections if and only if they are rank-1 projections, i. e. $\rho = |\psi\rangle\langle\psi|$
for some $\psi$ of norm 1 (see problem 28).

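Example Let $\psi_j \in \mathcal{H}$, $j = 1, 2$, be two linearly independent wave functions normalized to 1.
Then for any $0 < \alpha < 1$

\[ \rho = \alpha P_{\psi_1} + (1-\alpha) P_{\psi_2} = \alpha \, |\psi_1\rangle\langle\psi_1| + (1-\alpha) \, |\psi_2\rangle\langle\psi_2| \]

is a mixed state as

\begin{align*}
  \rho^2 &= \alpha^2 \, |\psi_1\rangle\langle\psi_1| + (1-\alpha)^2 \, |\psi_2\rangle\langle\psi_2| \, + \\
  &\quad + \alpha (1-\alpha) \bigl( |\psi_1\rangle\langle\psi_1|\psi_2\rangle\langle\psi_2| + |\psi_2\rangle\langle\psi_2|\psi_1\rangle\langle\psi_1| \bigr) \\
  &\neq \rho .
\end{align*}

Even if $\psi_1$ and $\psi_2$ are orthogonal to each other, since $\alpha^2 \neq \alpha$ and similarly $(1-\alpha)^2 \neq
(1-\alpha)$, $\rho$ cannot be a projection. Nevertheless, it is a state since $\mathrm{Tr} \, \rho = \alpha + (1-\alpha) = 1$.
Keep in mind that $\rho$ does not project on $\alpha\psi_1 + (1-\alpha)\psi_2$!

Also the expectation value of an observable $F$ with respect to a state $\rho$ is defined in terms
of the trace,

\[ \mathbb{E}_\rho(F) := \mathrm{Tr}(\rho F) , \]

which for pure states $\rho = |\psi\rangle\langle\psi|$ reduces to $\langle \psi , F\psi \rangle$.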
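The properties of a mixed state can be verified directly for the spin system $\mathcal{H} = \mathbb{C}^2$. The following sketch assumes the illustrative choices $\psi_1 = (1,0)$, $\psi_2 = (1,1)/\sqrt{2}$ and $\alpha = 1/4$, none of which come from the text, and checks that $\mathrm{Tr}\,\rho = 1$ while $\mathrm{Tr}\,\rho^2 < 1$, so $\rho^2 \neq \rho$; the helper names are hypothetical.

```python
import math

def projection(psi):
    """Rank-1 projection |psi><psi| for a normalized vector psi."""
    return [[a * b.conjugate() for b in psi] for a in psi]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def mixture(alpha, psi1, psi2):
    """rho = alpha |psi1><psi1| + (1 - alpha) |psi2><psi2|."""
    P1, P2 = projection(psi1), projection(psi2)
    n = len(psi1)
    return [[alpha * P1[i][j] + (1 - alpha) * P2[i][j] for j in range(n)]
            for i in range(n)]

psi1 = [1.0, 0.0]
psi2 = [1 / math.sqrt(2), 1 / math.sqrt(2)]
rho = mixture(0.25, psi1, psi2)

sigma3 = [[1, 0], [0, -1]]
purity = trace(matmul(rho, rho))          # equals 1 only for pure states
expectation = trace(matmul(rho, sigma3))  # E_rho(F) = Tr(rho F) with F = sigma_3
print(trace(rho), purity, expectation)
```

The expectation value is the weighted average $\alpha \langle \psi_1 , \sigma_3 \psi_1 \rangle + (1-\alpha) \langle \psi_2 , \sigma_3 \psi_2 \rangle$ of the pure-state expectation values, which is the statistical interpretation of a mixture.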

9.2.3 Time evolution 

The time evolution is determined through the Schrödinger equation,

\[ \mathrm{i}\hbar \frac{\partial}{\partial t} \psi(t) = H \psi(t) , \qquad \psi(t) \in \mathcal{H} , \; \psi(0) = \psi_0 , \; \|\psi_0\| = 1 . \tag{9.2.2} \]

Alternatively, one can write $\psi(t) = U(t) \psi_0$ with $U(0) = \mathrm{id}_{\mathcal{H}}$. Then, we have

\[ \mathrm{i}\hbar \frac{\partial}{\partial t} U(t) = H U(t) , \qquad U(0) = \mathrm{id}_{\mathcal{H}} . \]

If $H$ were a number, one would immediately use the ansatz

\[ U(t) = e^{-\mathrm{i} \frac{t}{\hbar} H} \tag{9.2.3} \]

as solution to the Schrödinger equation. If $H$ is a selfadjoint operator, this is still true,
but takes a lot of work to justify rigorously if the domain of $H$ is not all of $\mathcal{H}$ (the case of
unbounded operators, the generic case).

² Note that the condition $\mathrm{Tr} \, \rho = 1$ implies that $\rho$ is a bounded operator while the positivity implies the
selfadjointness. Hence, if $\rho$ is a projection, i. e. $\rho^2 = \rho$, it is automatically also an orthogonal projection.

As has already been mentioned, we can evolve either states or observables in time and
one speaks of the Schrödinger or Heisenberg picture, respectively. In the Schrödinger
picture, states evolve according to

\[ \psi(t) = e^{-\mathrm{i} \frac{t}{\hbar} H} \psi_0 \]

while observables remain fixed. Conversely, in the Heisenberg picture, states are kept
fixed in time and observables evolve according to

\[ F(t) := U(t)^* \, F \, U(t) = e^{+\mathrm{i} \frac{t}{\hbar} H} \, F \, e^{-\mathrm{i} \frac{t}{\hbar} H} . \tag{9.2.4} \]

Heisenberg observables satisfy Heisenberg’s equation of motion,

\[ \frac{\mathrm{d}}{\mathrm{d}t} F(t) = \frac{\mathrm{i}}{\hbar} \bigl[ H , F(t) \bigr] , \qquad F(0) = F , \tag{9.2.5} \]

which can be checked by plugging in the definition of $F(t)$ and elementary formal
manipulations. It is no coincidence that this equation looks structurally similar to
equation (3.3.1)!

Just like in the classical case, density operators have to be evolved backwards in time,
meaning that $\rho(t) = U(t) \, \rho \, U(t)^*$ satisfies

\[ \frac{\mathrm{d}}{\mathrm{d}t} \rho(t) = - \frac{\mathrm{i}}{\hbar} \bigl[ H , \rho(t) \bigr] , \qquad \rho(0) = \rho . \]

The equivalence of Schrödinger and Heisenberg picture is seen by comparing expectation
values just as in Chapter 3.4: the cyclicity of the trace, $\mathrm{Tr}(AB) = \mathrm{Tr}(BA)$, yields

\begin{align*}
  \mathbb{E}_{\rho(t)}(F) &= \mathrm{Tr} \bigl( \rho(t) \, F \bigr) = \mathrm{Tr} \bigl( U(t) \, \rho \, U(t)^* \, F \bigr) \\
  &= \mathrm{Tr} \bigl( \rho \, U(t)^* \, F \, U(t) \bigr) = \mathrm{Tr} \bigl( \rho \, F(t) \bigr) = \mathbb{E}_\rho \bigl( F(t) \bigr) .
\end{align*}
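The equivalence of the two pictures can be checked numerically on the spin system. The sketch below assumes units with $\hbar = 1$ and the illustrative hamiltonian $H = \sigma_1$, for which $U(t) = e^{-\mathrm{i}tH} = \cos t \, \mathrm{id} - \mathrm{i} \sin t \, \sigma_1$ because $\sigma_1^2 = \mathrm{id}$; the state is spin-up and the observable is $F = \sigma_3$. All names are hypothetical.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(A):
    """Adjoint (conjugate transpose) of a 2x2 matrix."""
    return [[complex(A[j][i]).conjugate() for j in range(2)] for i in range(2)]

def trace(A):
    return A[0][0] + A[1][1]

def propagator(t):
    """U(t) = exp(-i t sigma_1) = cos(t) id - i sin(t) sigma_1 (hbar = 1)."""
    c, s = math.cos(t), math.sin(t)
    return [[c, -1j * s], [-1j * s, c]]

rho = [[1, 0], [0, 0]]   # pure spin-up state |up><up|
F = [[1, 0], [0, -1]]    # observable sigma_3
t = 0.7
U = propagator(t)

# Schroedinger picture: evolve the state, rho(t) = U rho U^*
schroedinger = trace(matmul(matmul(matmul(U, rho), dagger(U)), F))
# Heisenberg picture: evolve the observable, F(t) = U^* F U
heisenberg = trace(matmul(rho, matmul(matmul(dagger(U), F), U)))
print(schroedinger, heisenberg, math.cos(2 * t))
```

Both traces agree with $\cos(2t)$, the expectation value of $\sigma_3$ in the state $\psi(t) = (\cos t , -\mathrm{i} \sin t)$.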
As a last point, we mention the conservation of probability: if $\psi(t)$ solves the Schrödinger
equation for some selfadjoint $H$, then we can check at least formally that the time
evolution is unitary and thus preserves probability,

\begin{align*}
  \frac{\mathrm{d}}{\mathrm{d}t} \|\psi(t)\|^2 &= \frac{\mathrm{d}}{\mathrm{d}t} \langle \psi(t) , \psi(t) \rangle
    = \bigl\langle \tfrac{1}{\mathrm{i}\hbar} H \psi(t) , \psi(t) \bigr\rangle + \bigl\langle \psi(t) , \tfrac{1}{\mathrm{i}\hbar} H \psi(t) \bigr\rangle \\
  &= \frac{\mathrm{i}}{\hbar} \Bigl( \langle \psi(t) , H^* \psi(t) \rangle - \langle \psi(t) , H \psi(t) \rangle \Bigr) \\
  &= \frac{\mathrm{i}}{\hbar} \bigl\langle \psi(t) , (H^* - H) \psi(t) \bigr\rangle = 0 .
\end{align*}

Conservation of probability is reminiscent of Proposition 3.2.1. We see that the condition
$H^* = H$ is the key here: selfadjoint operators generate unitary evolution groups. As a
matter of fact, there are cases when one wants to violate conservation of probability: one has
to introduce so-called optical potentials which simulate particle creation and annihilation.

The time evolution $e^{-\mathrm{i} \frac{t}{\hbar} H}$ is not the only unitary group of interest, other commonly
used examples are translations in position or momentum which are generated by the
momentum and position operator, respectively (the order is reversed!), as well as rotations
which are generated by the angular momentum operators.

9.2.4 Comparison of the two frameworks 

Now that we have an understanding of the structures of classical and quantum mechanics,
juxtaposed in Table 9.2.1, we can elaborate on the differences and similarities of both
theories. For instance, observables form an algebra (a vector space with multiplication):
in classical mechanics, we use the pointwise product of functions,

\begin{align*}
  \cdot \, : \, C^\infty(\mathbb{R}^{2n}) \times C^\infty(\mathbb{R}^{2n}) &\longrightarrow C^\infty(\mathbb{R}^{2n}) , \quad (f,g) \mapsto f \cdot g , \\
  (f \cdot g)(x,p) &:= f(x,p) \, g(x,p) ,
\end{align*}

which is obviously commutative. We also admit complex-valued functions and add complex
conjugation as involution (i. e. $f^{**} = f$). Lastly, we add the Poisson bracket to make
$C^\infty(\mathbb{R}^{2n})$ into a so-called Poisson algebra. As we have seen, the notion of Poisson bracket
gives rise to dynamics as soon as we choose an energy function (hamiltonian).

On the quantum side, bounded operators (see Chapter 5.1) form an algebra. This
algebra is non-commutative, i. e. in general

\[ F \cdot G \neq G \cdot F . \]

Exactly this is what makes quantum mechanics different. Taking adjoints is the involution
here and the commutator plays the role of the Poisson bracket. Again, once a hamiltonian
(operator) is chosen, the dynamics of Heisenberg observables $F(t)$ is determined by the
commutator of the $F(t)$ with the hamiltonian $H$. If an operator commutes with the
hamiltonian, it is a constant of motion. This is in analogy with Definition 3.3.3 where a classical
observable is a constant of motion if and only if its Poisson bracket with the hamiltonian
(function) vanishes.

                              Classical                                         Quantum

  Observables                 $f \in C^\infty(\mathbb{R}^{2n}, \mathbb{R})$                     selfadjoint operators acting on
                                                                                Hilbert space $\mathcal{H}$
  Building block              position $x$ and momentum $p$                     position $\hat{x}$ and momentum $\hat{p}$
  observables                                                                   operators
  Possible results of         $\mathrm{im}(f)$                                  $\sigma(F)$
  measurements
  States                      probability measures $\mu$ on                     density operators $\rho$ on $\mathcal{H}$
                              phase space $\mathbb{R}^{2n}$
  Pure states                 points in phase space $\mathbb{R}^{2n}$           wave functions $\psi \in \mathcal{H}$
  Generator of evolution      hamiltonian function                              hamiltonian operator $H$
                              $H : \mathbb{R}^{2n} \longrightarrow \mathbb{R}$
  Infinitesimal time          $\frac{\mathrm{d}}{\mathrm{d}t} f(t) = \{ H , f(t) \}$            $\frac{\mathrm{d}}{\mathrm{d}t} F(t) = \frac{\mathrm{i}}{\hbar} [ H , F(t) ]$
  evolution equation
  Integrated time             hamiltonian flow $\phi_t$                         $e^{+\mathrm{i} \frac{t}{\hbar} H} \, \cdot \, e^{-\mathrm{i} \frac{t}{\hbar} H}$
  evolution

Table 9.2.1: Comparison of classical and quantum framework

9.2.5 Representations 

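Linear algebra distinguishes abstract linear maps $H : \mathcal{X} \longrightarrow \mathcal{Y}$ and their representations
as matrices using a basis in initial and target space: any pair of bases $\{x_n\}_{n=1}^N$ and
$\{y_k\}_{k=1}^K$ of $\mathcal{X} \cong \mathbb{C}^N$ and $\mathcal{Y} \cong \mathbb{C}^K$ induces a matrix representation $h = (h_{kn}) \in \mathrm{Mat}_{\mathbb{C}}(N,K)$ of $H$
(called basis representation) via

\[ H x_n = \sum_{k=1}^K h_{kn} \, y_k . \]

The basis now identifies coordinates on the vector spaces: $x = \sum_{n=1}^N \xi_n x_n \in \mathcal{X}$ has the
coordinate $\xi = (\xi_1, \ldots, \xi_N) \in \mathbb{C}^N$, and similarly $y = \sum_{k=1}^K \eta_k y_k \in \mathcal{Y}$ is expressed in terms
of the coordinate $\eta \in \mathbb{C}^K$. Using these coordinates, the equation $Hx = y$ becomes the
matrix equation $h \xi = \eta$.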
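A change in basis can now be described in the same way: if $\{x_n'\}_{n=1}^N$ and $\{y_k'\}_{k=1}^K$ are
two other orthonormal bases, then the coordinate representations of the maps

\begin{align*}
  U_{xx'} &: x_n \mapsto x_n' \\
  U_{yy'} &: y_k \mapsto y_k'
\end{align*}

are unitary matrices $u_{xx'} \in \mathcal{U}(\mathbb{C}^N)$ and $u_{yy'} \in \mathcal{U}(\mathbb{C}^K)$, and these matrices connect the
coordinate representations of $H$ with respect to $\{x_n\}_{n=1}^N$, $\{y_k\}_{k=1}^K$ and $\{x_n'\}_{n=1}^N$, $\{y_k'\}_{k=1}^K$,

\[ h' = u_{yy'} \, h \, u_{xx'}^{-1} . \]

$u_{xx'}^{-1}$ maps $\xi'$ onto $\xi$, $h$ maps $\xi$ onto $\eta$ and $u_{yy'}$ maps $\eta$ onto $\eta'$.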

Similarly, we can represent operators on infinite-dimensional Hilbert spaces such as
$L^2(\mathbb{R}^d)$ in much the same way: for instance, consider the free Schrödinger operator $H =
-\frac{1}{2} \Delta_x : \mathcal{D} \subset L^2(\mathbb{R}^d) \longrightarrow L^2(\mathbb{R}^d)$. Then the Fourier transform $\mathcal{F} : L^2(\mathbb{R}^d) \longrightarrow L^2(\mathbb{R}^d)$
is such a unitary which changes from one “coordinate system” to another, and the free
Schrödinger operator in this new representation becomes a simple multiplication operator

\[ \mathcal{F} \, H \, \mathcal{F}^{-1} = \tfrac{1}{2} \hat{\xi}^2 . \]

Because initial and target space are one and the same, $\mathcal{F}$ appears twice.

Another unitary is a rescaling which can be seen as a change of units: for $\lambda > 0$ one
defines

\[ (U_\lambda \psi)(x) := \lambda^{d/2} \, \psi(\lambda x) \]

where the scaling factor $\lambda$ relates the two scales. Similarly, other linear changes of the
underlying configuration space $\mathbb{R}^d$ (e. g. rotations) induce a unitary operator on $L^2(\mathbb{R}^d)$.

One can exploit this freedom of representation to simplify a problem: Just like choosing 
spherical coordinates for a problem with spherical symmetry, we can work in a repre¬ 
sentation which simplifies the problem. For instance, the Fourier transform exploits the 
translational symmetry of the free Schrddinger operator (H commutes with translations). 

Another example would be to use an eigenbasis: assume $H = H^* \geq 0$ has a set of
eigenvectors $\{\psi_n\}_{n \in \mathbb{N}}$ which span all of $\mathcal{H}$, i. e. the $\psi_n$ are linearly independent and
$H \psi_n = E_n \psi_n$ where $E_n \in \mathbb{R}$ is the eigenvalue. The eigenvalues are enumerated by
magnitude and repeated according to their multiplicity, i. e. $E_1 \leq E_2 \leq \ldots$. Just like in the case
of hermitian matrices, the eigenvectors to distinct eigenvalues of selfadjoint operators
are orthogonal, and hence, we can choose the $\{\psi_n\}_{n \in \mathbb{N}}$ to be orthonormal. Then the suitable
unitary is

\[ U : \mathcal{H} \longrightarrow \ell^2(\mathbb{N}) , \quad \psi = \sum_{n=1}^\infty \hat{\psi}(n) \, \psi_n \mapsto \hat{\psi} \in \ell^2(\mathbb{N}) \]
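where $\hat{\psi} = (\hat{\psi}(1), \hat{\psi}(2), \ldots)$ is the sequence of coefficients and $\ell^2(\mathbb{N})$ is the prototypical
Hilbert space defined in Definition 4.2.7; moreover, the definition of orthonormal basis
(Definition 4.2.3) implies that $\hat{\psi}$ is necessarily square summable.

In this representation, $H$ can be seen as an “infinite diagonal matrix”

\[ H^U := U H U^{-1} = \sum_{n=1}^\infty E_n \, |\psi_n\rangle\langle\psi_n| \;\cong\; \begin{pmatrix} E_1 & 0 & \cdots \\ 0 & E_2 & \ddots \\ \vdots & \ddots & \ddots \end{pmatrix} \]

where $|\psi_n\rangle\langle\psi_n| = \langle \psi_n , \, \cdot \, \rangle \, \psi_n$ are the rank-1 projections onto $\psi_n$. Put another way, $H^U$ acts on
$\hat{\psi} \in \ell^2(\mathbb{N})$ as

\[ H^U \hat{\psi} = \bigl( E_1 \hat{\psi}(1) , E_2 \hat{\psi}(2) , \ldots \bigr) . \]

The simple structure of this operator allows one to compute the unitary evolution group
explicitly in terms of the projections,

\[ e^{-\mathrm{i} \frac{t}{\hbar} H} = \sum_{n=1}^\infty e^{-\mathrm{i} \frac{t}{\hbar} E_n} \, |\psi_n\rangle\langle\psi_n| . \]

Sadly, most Schrödinger operators $H$ do not have a basis of eigenvectors.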
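In the eigenbasis representation the time evolution reduces to multiplying each coefficient by a phase. The following is a minimal sketch, assuming a truncation to finitely many coefficients and illustrative integer energies $E_n = n$ in units with $\hbar = 1$ (these values are not from the text); the function names are hypothetical.

```python
import cmath
import math

def evolve_coefficients(coeffs, energies, t, hbar=1.0):
    """Apply exp(-i t H / hbar) in the eigenbasis: the n-th coefficient gets the phase exp(-i t E_n / hbar)."""
    return [cmath.exp(-1j * t * E / hbar) * c for c, E in zip(coeffs, energies)]

def norm(coeffs):
    """ell^2 norm of a (truncated) coefficient sequence."""
    return math.sqrt(sum(abs(c) ** 2 for c in coeffs))

energies = [1.0, 2.0, 3.0]   # illustrative eigenvalues E_1 <= E_2 <= E_3
coeffs0 = [0.6, 0.8j, 0.0]   # illustrative normalized initial coefficients

coeffs_t = evolve_coefficients(coeffs0, energies, t=0.37)
print(norm(coeffs0), norm(coeffs_t))   # the norm is preserved
```

Since each coefficient is only multiplied by a number of modulus 1, the probabilities $|\hat{\psi}(n)|^2$ of the individual energy levels do not change in time, and for integer energies the state returns to itself at $t = 2\pi$.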


9.3 Spectral properties of hamiltonians 

The spectrum of an operator is the generalization of the set of eigenvalues for matrices.
According to Definition 5.1.7 the spectrum can be divided up into three parts, the point
spectrum

\[ \sigma_p(H) := \{ z \in \mathbb{C} \mid H - z \text{ is not injective} \} , \]

the continuous spectrum

\[ \sigma_{\mathrm{cont}}(H) := \{ z \in \mathbb{C} \mid H - z \text{ is injective, } \mathrm{im}(H - z) \subseteq \mathcal{H} \text{ dense} \} , \]

and the residual spectrum

\[ \sigma_r(H) := \{ z \in \mathbb{C} \mid H - z \text{ is injective, } \mathrm{im}(H - z) \subseteq \mathcal{H} \text{ not dense} \} . \]

Point spectrum is due to eigenvalues with eigenvector. Compared to matrices, the
occurrence of continuous and residual spectra is new. The residual spectrum is not important
for our discussion as it is empty for selfadjoint operators.
The continuous spectrum can be attributed to cases where the eigenvectors are not
elements of the Hilbert space. For instance, in case of the free Schrödinger operator
$H = -\frac{1}{2} \Delta_x$ on $L^2(\mathbb{R}^d)$ the spectrum is $\sigma(H) = \sigma_{\mathrm{cont}}(H) = [0,+\infty)$. Here, the
eigenvectors are plane waves, $e^{+\mathrm{i} \xi \cdot x}$, which are smooth, bounded functions; however, plane waves
are not square integrable. Similarly, multiplication operators have Dirac distributions as
eigen“functions”.

Note that this distinction between the spectral components goes further than looking
at the spectrum as a set: for instance, it is known that certain random Schrödinger
operators have dense point spectrum which “looks” the same as continuous spectrum. The
spectrum can be probed by means of approximate eigenfunctions (“Weyl’s Criterion”, see
Theorem 5.2.4).

There is also a second helpful classification of spectrum which cannot be made rigor¬ 
ous with the tools we have at hand, and that is the distinction between essential spec¬ 
trum cjg 5 s(H) and discrete spectrum The essential spectrum is stable under local, 

short-range perturbations while the discrete spectrum may change. One has the following 
characterization for the essential spectrum: 

Theorem 9.3.1 (Theorem VII. 10 in [RS72] ) A e CTessCH) ijf one or more of the following 
holds: 

(i) Aeo-<.ont(H) 

(ii) Xis a limit point of a 
(in) Xis an eigenvalue of infinite multiplicity. 

Similarly, the discrete spectrum has a similar characterization: 

Theorem 9.3.2 (TheoremVII.il in [RS72] ) A e cr^i^^(H) if and only if both of the fol¬ 
lowing hold: 

(i) Xis an isolated point ofu(H\ i. e. for some e > Owe have {X — e,X-\-e) ncr(H) = {A}. 

(ii) Xis an eigenvalue of finite multiplicity. 


9.3.1 Spectra of common selfadjoint operators 

Quite generally, the spectrum of selfadjoint operators is purely real. But before we prove
that, let us discuss some examples from physics:

Multiplication operators. The spectrum of the multiplication operator

\[ \bigl( f(\hat{x}) \psi \bigr)(x) := f(x) \, \psi(x) \]

is given by the closure of the range, $\sigma(f(\hat{x})) = \overline{\mathrm{ran} f}$, where
$f : \mathbb{R}^d \to \mathbb{R}$ is a piecewise-continuous function.*

To see this claim, we rely on the Weyl criterion: in order to show
$\sigma(f(\hat{x})) \supseteq \overline{\mathrm{ran} f}$, pick any $\lambda \in \overline{\mathrm{ran} f}$. Then
there exists a sequence $x_n$ such that $|\lambda - f(x_n)| < 1/n$. Then by shifting a
Dirac sequence by $x_n$ (e. g. scaled Gaussians), we obtain a sequence of normalized
vectors $\{ \psi_n \}_{n \in \mathbb{N}}$ with $\| (f(\hat{x}) - \lambda) \psi_n \| \to 0$. Hence, this
reasoning shows $\overline{\mathrm{ran} f} \subseteq \sigma(f(\hat{x}))$.

To show the converse inclusion, let $\lambda \in \sigma(f(\hat{x}))$. Then there exists a Weyl
sequence $\{ \psi_n \}_{n \in \mathbb{N}}$ with $\| (f(\hat{x}) - \lambda) \psi_n \| \to 0$ as $n \to \infty$.
Assume $\inf_{x \in \mathbb{R}^d} |f(x) - \lambda| = c > 0$, i. e. $\lambda \notin \overline{\mathrm{ran} f}$, then
$\{ \psi_n \}$ cannot be a Weyl sequence to $\lambda$,

\[ \| (f(\hat{x}) - \lambda) \psi_n \| \geq \inf_{x \in \mathbb{R}^d} |f(x) - \lambda| \, \| \psi_n \| \geq c > 0 , \]

which is absurd.

Should $f$ be constant and equal to $\lambda_0$ on a set of positive measure, there are
infinitely many eigenfunctions associated to the eigenvalue $\lambda_0$. Otherwise, $f(\hat{x})$
has continuous spectrum. In any case, the spectrum of $f(\hat{x})$ is purely essential.

Clearly, this takes care of any operator which is unitarily equivalent to a multiplication
operator, e. g. the free Laplacian on $\mathbb{R}^d$, or the tight-binding hamiltonians from
Chapter 6.1.6.2.
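To illustrate what “unitarily equivalent to a multiplication operator” means in the simplest discrete setting, here is a minimal sketch. It assumes a hypothetical nearest-neighbour hopping operator on a ring of $N$ sites (not necessarily the tight-binding model of the earlier chapter); the names `hop` and `plane_wave` are illustrative. Discrete plane waves are eigenvectors, so in Fourier representation the operator acts as multiplication by $2 \cos k$.

```python
import cmath
import math

def hop(psi):
    """Nearest-neighbour hopping on a ring: (H psi)(n) = psi(n+1) + psi(n-1)."""
    N = len(psi)
    return [psi[(n + 1) % N] + psi[(n - 1) % N] for n in range(N)]

def plane_wave(N, j):
    """Discrete plane wave exp(i k n) with quantized momentum k = 2 pi j / N."""
    k = 2 * math.pi * j / N
    return [cmath.exp(1j * k * n) for n in range(N)]

N = 8
residuals = []
for j in range(N):
    k = 2 * math.pi * j / N
    psi = plane_wave(N, j)
    H_psi = hop(psi)
    # Largest deviation of H psi from 2 cos(k) psi over all sites
    residuals.append(max(abs(a - 2 * math.cos(k) * b) for a, b in zip(H_psi, psi)))

print(max(residuals))  # of the order of rounding errors
```

On the ring the spectrum is the finite set $\{ 2 \cos(2 \pi j / N) \}$; as $N$ grows these values fill the range $[-2,+2]$ of the multiplication function.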


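The hydrogen atom. One of the earliest celebrated successes of quantum mechanics is
the explanation of the spectral lines by Schrödinger [Sch26b; Sch26d; Sch26a; Sch26c].
Here, the operator

\[ H := - \frac{\hbar^2}{2m} \Delta_x - \frac{e}{|\hat{x}|} \]

acts on a dense subspace of $L^2(\mathbb{R}^3)$. A non-obvious fact is that this operator is bounded
from below, i. e. there exists a constant $c > 0$ such that $H \geq -c$. This is false for the
corresponding classical system, because the function $h(q,p) = \frac{1}{2m} p^2 - \frac{e}{|q|}$ is not
bounded from below.

The reason for that is that states of low potential energy (i. e. wave functions which are
sharply peaked around 0) must pay an ever larger price in kinetic energy (sharply peaked
means large gradient). One heuristic way to see that is to compute the energy expectation
value of $\psi_\lambda := \lambda^{3/2} \psi(\lambda x)$ for $\lambda \gg 1$ where $\psi \in L^2(\mathbb{R}^3)$ is a
fixed, sufficiently regular wave function:

\[
\begin{aligned}
\mathbb{E}(\psi_\lambda) &= \frac{\hbar^2}{2m} \langle \psi_\lambda , -\Delta_x \psi_\lambda \rangle - e \, \langle \psi_\lambda , |\hat{x}|^{-1} \psi_\lambda \rangle \\
&= \frac{\hbar^2}{2m} \lambda^3 \int_{\mathbb{R}^3} dx \, \lambda^2 \, | (\nabla_x \psi)(\lambda x) |^2 - e \, \lambda^3 \int_{\mathbb{R}^3} dx \, \lambda \, \frac{| \psi(\lambda x) |^2}{\lambda |x|} \\
&= \lambda^2 \, \frac{\hbar^2}{2m} \langle \psi , -\Delta_x \psi \rangle - \lambda \, e \, \langle \psi , |\hat{x}|^{-1} \psi \rangle .
\end{aligned}
\]

For large $\lambda$ the kinetic term $\propto \lambda^2$ dominates the potential term $\propto \lambda$.
Clearly, if one replaces the Coulomb potential by $-|x|^{-3}$, the potential energy
scales like $\lambda^3$ and wins over the kinetic energy, and the quantum particle can “fall
down the well”.

*This condition (piecewise continuity of $f$ for multiplication operators) can be relaxed and is chosen just for ease of use.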
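The scaling argument can be made slightly more quantitative. The following worked equation is a sketch under the assumption that $\psi$ is normalized with finite kinetic and Coulomb energy; the abbreviations $T$ and $U$ are introduced here only for this computation. Along the family $\psi_\lambda$ the energy is a parabola in $\lambda$, and hence it stays bounded from below.

```latex
\begin{align*}
E(\lambda) &:= \lambda^2 \, T - \lambda \, U ,
  & T &:= \tfrac{\hbar^2}{2m} \langle \psi , -\Delta_x \psi \rangle > 0 ,
  & U &:= e \, \langle \psi , |\hat{x}|^{-1} \psi \rangle > 0 , \\
\tfrac{\mathrm{d}}{\mathrm{d}\lambda} E(\lambda) &= 2 \lambda T - U = 0
  \quad \Longleftrightarrow \quad \lambda_* = \frac{U}{2T} ,
  & \min_{\lambda > 0} E(\lambda) &= E(\lambda_*) = - \frac{U^2}{4T} > -\infty .
\end{align*}
```

This only shows boundedness along one scaling family; the statement $H \geq -c$ for all states requires a genuine operator inequality.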

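The negative potential gives rise to a family of eigenvalues (the spectral lines) while
$-\frac{\hbar^2}{2m} \Delta_x$ contributes continuous spectrum $[0,+\infty)$,

\[
\begin{aligned}
\sigma(H) &= \{ E_n \}_{n \in \mathbb{N}} \cup [0,+\infty) , \\
\sigma_{\mathrm{cont}}(H) &= [0,+\infty) = \sigma_{\mathrm{ess}}(H) , \\
\sigma_p(H) &= \{ E_n \}_{n \in \mathbb{N}} = \sigma_{\mathrm{disc}}(H) .
\end{aligned}
\]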


9.3.2 The spectrum of selfadjoint operators is real 

As a side note, let us show that the spectrum of selfadjoint operators is purely real. 

Theorem 9.3.3 Let $H = H^*$ be a selfadjoint operator on the Hilbert space $\mathcal{H}$. Then the
following holds true:

(i) $\sigma(H) \subseteq \mathbb{R}$

(ii) $H \geq 0 \; \Rightarrow \; \sigma(H) \subseteq [0,+\infty)$

To prove this, we use the following

Lemma 9.3.4 Let $\mathcal{H}_j$, $j = 1,2$, be Hilbert spaces. Then an operator
$T \in \mathcal{B}(\mathcal{H}_1,\mathcal{H}_2)$ is invertible if and only if there exists a constant $C > 0$
such that $T^* T \geq C \, \mathrm{id}_{\mathcal{H}_1}$ and $T \, T^* \geq C \, \mathrm{id}_{\mathcal{H}_2}$ hold.

Proof Assume $T$ is invertible. Then $T^* : \mathcal{H}_2 \to \mathcal{H}_1$ is also invertible with inverse
$(T^*)^{-1} = (T^{-1})^*$. Set $C := \| T^{-1} \|^{-2}$. Then the inequality

\[ \| \psi \| = \| T^{-1} T \psi \| \leq \| T^{-1} \| \, \| T \psi \| \]

proves $\| T \psi \| \geq \| T^{-1} \|^{-1} \| \psi \|$ and thus also

\[ \langle \psi , T^* T \psi \rangle = \| T \psi \|^2 \geq \| T^{-1} \|^{-2} \, \| \psi \|^2 = C \, \| \psi \|^2 , \tag{9.3.1} \]

i. e. we have shown $T^* T \geq C \, \mathrm{id}_{\mathcal{H}_1}$. The lower bound for $T \, T^*$ is shown
analogously, using $\| (T^*)^{-1} \| = \| T^{-1} \|$.

Conversely, suppose there exists $C > 0$ such that $T^* T \geq C \, \mathrm{id}_{\mathcal{H}_1}$ and
$T \, T^* \geq C \, \mathrm{id}_{\mathcal{H}_2}$. Then from (9.3.1) we deduce that
$\| T \psi \| \geq \sqrt{C} \, \| \psi \|$ holds for all $\psi \in \mathcal{H}_1$. First of all, this proves that $T$
is injective, and secondly $T$ has closed range in $\mathcal{H}_2$ (if $T \psi_n$ converges, the bound
shows that $\{ \psi_n \}_{n \in \mathbb{N}}$ is a Cauchy sequence). Moreover, one can easily see

\[ \mathrm{ran} \, T = \overline{\mathrm{ran} \, T} = (\ker T^*)^{\perp} . \]

Since we can make the same arguments for $T^*$, we also know that $T^*$ is injective, and thus
$\ker T^* = \{ 0 \}$. This shows that $T$ is surjective, i. e. it is bijective, and hence, invertible. □

With the proof of the Lemma complete, we can now prove the statement: 

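Proof (Theorem 9.3.3) (i) Let $H = H^*$ be selfadjoint and $z = \lambda + \mathrm{i} \mu \in \mathbb{C} \setminus \mathbb{R}$
be a complex number with non-vanishing imaginary part $\mu$. We will show that
$z \notin \sigma(H)$, i. e. that $H - z$ is invertible: a quick computation shows

\[ (H - z)^* (H - z) = H^2 - 2 (\mathrm{Re} \, z) H + |z|^2 = H^2 - 2 \lambda H + (\lambda^2 + \mu^2) = \mu^2 + (H - \lambda)^2 . \]

The last term is non-negative, and thus, we have shown

\[ (H - z)^* (H - z) \geq \mu^2 > 0 . \]

The same bound holds for $(H - z) (H - z)^*$. By the Lemma, this means $H - z$ is necessarily
invertible, and $z \notin \sigma(H)$.

(ii) We have to show that for $\lambda \in (-\infty,0)$, the operator $H - \lambda$ is invertible. This
follows as before from

\[ (H - \lambda)^* (H - \lambda) = H^2 - 2 \lambda H + \lambda^2 \geq \lambda^2 > 0 , \]

the non-negativity of $-2 \lambda H = 2 |\lambda| H$ and the Lemma. □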
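The key inequality of part (i), $\| (H - z) \psi \| \geq | \mathrm{Im} \, z | \, \| \psi \|$, can be checked numerically in finite dimensions. The sketch below assumes a hypothetical hermitian $2 \times 2$ matrix and a handful of test vectors chosen only for illustration; `apply` and `norm` are helper names introduced here.

```python
def apply(matrix, vector):
    """Matrix-vector product for small dense matrices given as nested lists."""
    return [sum(row[j] * vector[j] for j in range(len(vector))) for row in matrix]

def norm(vector):
    """Euclidean norm of a complex vector."""
    return sum(abs(c) ** 2 for c in vector) ** 0.5

H = [[1.0, 2.0 - 1.0j],
     [2.0 + 1.0j, -3.0]]          # hypothetical hermitian matrix
z = 0.5 + 0.25j                   # spectral parameter with Im z = 0.25
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0j], [0.3 - 0.2j, -0.7]]

ratios = []
for psi in vectors:
    shifted = [a - z * b for a, b in zip(apply(H, psi), psi)]   # (H - z) psi
    ratios.append(norm(shifted) / norm(psi))

print(min(ratios) >= abs(z.imag))  # lower bound by |Im z| holds for every test vector
```

The bound is uniform in $\psi$, which is exactly what the Lemma needs in order to conclude invertibility of $H - z$.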


9.3.3 Eigenvalues and bound states 

The hydrogen atom is a prototypical example of the type of problem we are interested in,
namely Schrödinger operators on $L^2(\mathbb{R}^d)$ of the form

\[ H = -\Delta_x + V \]

where $V \leq 0$ is a non-positive potential decaying at infinity ($\lim_{|x| \to \infty} V(x) = 0$). Under
suitable technical conditions on the potential, $H$ defines a selfadjoint operator which is
bounded from below, that is $H \geq c$ holds for some $c \in \mathbb{R}$, and we have

\[ \sigma_{\mathrm{ess}}(H) = \sigma(-\Delta_x) = [0,+\infty) . \]

Now the question is whether $\sigma_p(H) = \emptyset$ or

\[ \sigma_p(H) = \{ E_n \}_{n=0}^{N} \subset (-\infty,0) \]

for some $N \in \mathbb{N}_0 \cup \{ \infty \}$. We shall always assume that the eigenvalues are ordered by
magnitude,

\[ E_0 \leq E_1 \leq \ldots \]

The ground state $\psi_0$ is the eigenfunction to the lowest eigenvalue $E_0$. Eigenfunctions $\psi$
are localized: the weakest form of localization is $\psi \in L^2(\mathbb{R}^d)$, but usually one can expect
exponential localization.

So there are two natural questions which we will answer in turn: 

(1) Do eigenvalues below the essential spectrum exist? 

(2) Can we give estimates on their numerical values? 

9.3.3.1 The Birman-Schwinger principle 

We begin with the Birman-Schwinger principle, the standard tool which gives a criterion
for the existence and absence of eigenvalues at a specific energy level. Assume $\psi$ is an
eigenvector of $H$ to the eigenvalue $-E < 0$. Then the eigenvalue equation is equivalent to

\[ (-\Delta_x + E) \psi = -V \psi = |V| \psi . \]

If we define the vector $\varphi := |V|^{1/2} \psi$ and use that $-E \notin \sigma(-\Delta_x) = [0,+\infty)$, we obtain

\[ |V|^{1/2} (-\Delta_x + E)^{-1} |V|^{1/2} \varphi = \varphi . \]

In other words, we have just shown the

Theorem 9.3.5 (Birman-Schwinger principle) The function $\psi \in L^2(\mathbb{R}^d)$ is an eigenvector
of $H = -\Delta_x + V$ to the eigenvalue $-E < 0$ if and only if $\varphi = |V|^{1/2} \psi$ is an eigenvector of the
Birman-Schwinger operator

\[ K_E := |V|^{1/2} (-\Delta_x + E)^{-1} |V|^{1/2} \tag{9.3.2} \]

to the eigenvalue 1.

The only assumption we have glossed over is the boundedness of $K_E$. One may think
that solving $K_E \varphi = \varphi$ is just as difficult as $H \psi = -E \psi$, but it is not. For instance, we
immediately obtain the following

Corollary 9.3.6 Assume the Birman-Schwinger operator $K_E \in \mathcal{B}(L^2(\mathbb{R}^d))$ is bounded. Then
for $\lambda_0$ small enough, $H_\lambda = -\Delta_x + \lambda V$ has no eigenvalue at $-E$ for all $0 \leq \lambda < \lambda_0$.

Proof Replacing $V$ with $\lambda V$ in equation (9.3.2) yields that the Birman-Schwinger operator
for $H_\lambda$ is $\lambda K_E$. Thus, for $\lambda$ small enough, we can make $\lambda \, \| K_E \| < 1$ arbitrarily small and
since $\sup | \sigma(K_E) | \leq \| K_E \|$,* this means 1 cannot be an eigenvalue of $\lambda K_E$. Hence, by
the Birman-Schwinger principle there cannot exist an eigenvalue at $-E$. □


Another advantage is that we have an explicit expression for the operator kernel of $K_E$,
the Birman-Schwinger kernel, which allows us to make explicit estimates. In general, an
operator kernel $K_T$ for an operator $T$ is a distribution on $\mathbb{R}^d \times \mathbb{R}^d$ so that

\[ (T \psi)(x) = \int_{\mathbb{R}^d} dy \, K_T(x,y) \, \psi(y) . \]

For the sake of brevity, we will also write $T(x,y)$ for $K_T(x,y)$. We have dedicated
Chapter 8 to one specific example: assume the operator $L$ is invertible and $L u = f$, then

\[ u(x) = \int_{\mathbb{R}^d} dy \, G(x,y) \, f(y) = (L^{-1} f)(x) \]

holds. In other words, the Green’s function $G$ is the operator kernel of $L^{-1}$.

Seeing as $K_E$ is the product of the multiplication operator $|V|^{1/2}$ and $(-\Delta_x + E)^{-1}$, the
dimension-dependent, explicit expression of the Birman-Schwinger kernel involves only the
Green’s function of $-\Delta_x + E$ in that particular dimension,

\[ K_E(x,y) = |V(x)|^{1/2} \, (-\Delta_x + E)^{-1}(x,y) \, |V(y)|^{1/2} . \]

In odd dimension, there exist closed expressions for $(-\Delta_x + E)^{-1}(x,y)$ while for even $d$,
no neat formulas for it exist. Nevertheless, its behavior can be characterized.

Let us return to the original question: Can we show the existence of eigenvalues as well via
the Birman-Schwinger principle? The answer is yes, and we will treat a particular case:

Theorem 9.3.7 ([Sim76]) Consider the Schrödinger operator $H_\lambda = -\partial_x^2 + \lambda V$ on $L^2(\mathbb{R})$
where $\lambda > 0$ and the potential satisfies $V \in L^1(\mathbb{R})$, $V \neq 0$, $V \leq 0$, and

\[ \int_{\mathbb{R}} dx \, (1 + x^2) \, |V(x)| < \infty . \]

Then there exists $\lambda_0 > 0$ small enough so that $H_\lambda$ has exactly one eigenvalue

\[ E_\lambda = - \frac{\lambda^2}{4} \left( \int_{\mathbb{R}} dx \, |V(x)| \right)^2 + \mathcal{O}(\lambda^3) \tag{9.3.3} \]

for all $\lambda \in (0,\lambda_0)$.

*Footnote to the proof of Corollary 9.3.6: This is a general fact: if $T \in \mathcal{B}(\mathcal{X})$ is an operator on a Banach space, then $\sup | \sigma(T) | \leq \| T \|$ holds [Yos80, Chapter VIII.2, Theorems 3 and 4].

The eigenvalue gives an intuition on the shape of the eigenfunction: it has few oscillations
to minimize kinetic energy and is approximately constant in the region where $V$ is
appreciably different from 0 (this region is not too large because of the decay assumption
$\int_{\mathbb{R}} dx \, x^2 \, |V(x)| < \infty$). Hence, the eigenfunction sees only the average value of the
potential.

This intuition neither explains why other eigenvalues may appear nor that for $d \geq 3$,
the theorem is false.
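The weak-coupling statement can be compared with an exactly solvable case. The sketch below assumes a hypothetical square well $V = -v_0 \, 1_{[-a,+a]}$, which is not an example from the text: the even ground state of $-\partial_x^2 + \lambda V$ satisfies the textbook matching condition $k \tan(k a) = \kappa$ with $E = -\kappa^2$ and $k^2 = \lambda v_0 - \kappa^2$. The function names and the bisection solver are illustrative choices.

```python
import math

def ground_state_energy(lam, v0=1.0, a=1.0):
    """Lowest eigenvalue of -d^2/dx^2 - lam*v0*1_{[-a,a]} from k*tan(k*a) = kappa."""
    depth = lam * v0
    assert math.sqrt(depth) * a < math.pi / 2, "sketch only covers shallow wells"

    def mismatch(kappa):
        k = math.sqrt(max(depth - kappa * kappa, 0.0))
        return k * math.tan(k * a) - kappa

    lo, hi = 0.0, math.sqrt(depth)   # mismatch > 0 near lo and < 0 near hi
    for _ in range(200):             # plain bisection on the monotone mismatch
        mid = 0.5 * (lo + hi)
        if mismatch(mid) > 0:
            lo = mid
        else:
            hi = mid
    kappa = 0.5 * (lo + hi)
    return -kappa * kappa

def weak_coupling_prediction(lam, v0=1.0, a=1.0):
    """Leading term -(lam/2 * int |V|)^2 with int |V| = 2*a*v0."""
    return -(0.5 * lam * 2.0 * a * v0) ** 2

for lam in (0.1, 0.01):
    print(lam, ground_state_energy(lam), weak_coupling_prediction(lam))
```

The ratio of the two numbers approaches 1 as $\lambda \to 0$, in line with the leading term of the theorem; the remaining discrepancy shrinks proportionally to $\lambda$.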

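Proof The arguments in [Sim76, Section 2] ensure the boundedness of the Birman-Schwinger
operator. Moreover, in one dimension the Green’s function for $-\partial_x^2 + E$ exists
($-E \notin \sigma(-\partial_x^2)$) and can be computed explicitly, namely

\[ (-\partial_x^2 + E)^{-1}(x,y) = \bigl( (-\partial_x^2 + E)^{-1} \bigr)(x - y) = \frac{\mathrm{e}^{- \sqrt{E} |x - y|}}{2 \sqrt{E}} . \]

To simplify notation, let us define $\mu := \sqrt{E}$. Thus, the Birman-Schwinger kernel is the
function

\[ K_{\mu^2}(x,y) = |V(x)|^{1/2} \, \frac{\mathrm{e}^{- \mu |x - y|}}{2 \mu} \, |V(y)|^{1/2} . \]

In addition, define the operators

\[ L_\mu := \frac{1}{2 \mu} \, \bigl| |V|^{1/2} \bigr\rangle \bigl\langle |V|^{1/2} \bigr| \]

and $M_\mu := K_{\mu^2} - L_\mu$. Clearly, given that $V \in L^1(\mathbb{R})$, its square root $|V|^{1/2}$ is in
$L^2(\mathbb{R})$ and $L_\mu$ is a bounded rank-1 operator. Moreover, the operator kernel

\[ M_\mu(x,y) = |V(x)|^{1/2} \, \frac{\mathrm{e}^{- \mu |x - y|} - 1}{2 \mu} \, |V(y)|^{1/2} \]

is well-defined in the limit $\mu \to 0$ and analytic for $\mu \in \mathbb{C}$ with $\mathrm{Re} \, \mu > 0$.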

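The Birman-Schwinger principle tells us that $H_\lambda$ has an eigenvalue at $-\mu^2$ if and only
if $1 \in \sigma_p(\lambda K_{\mu^2})$: for $\lambda \ll 1$ small enough we have $\| \lambda M_\mu \| < 1$ which means the
Neumann series*

\[ (1 - \lambda M_\mu)^{-1} = 1 + \lambda M_\mu + \mathcal{O}(\lambda^2) \tag{9.3.4} \]

exists in $\mathcal{B}(L^2(\mathbb{R}))$. Hence, the invertibility of

\[
\begin{aligned}
1 - \lambda K_{\mu^2} &= 1 - \lambda M_\mu - \lambda L_\mu \\
&= (1 - \lambda M_\mu) \bigl( 1 - \lambda (1 - \lambda M_\mu)^{-1} L_\mu \bigr)
\end{aligned}
\]

hinges on whether 1 is an eigenvalue of

\[ \lambda (1 - \lambda M_\mu)^{-1} L_\mu = \frac{\lambda}{2 \mu} \, \bigl| (1 - \lambda M_\mu)^{-1} |V|^{1/2} \bigr\rangle \bigl\langle |V|^{1/2} \bigr| . \]

This is again a rank-1 operator, and thus, we can read off the eigenvector
$(1 - \lambda M_\mu)^{-1} |V|^{1/2}$ to its only non-zero eigenvalue. Moreover, we can compute this
eigenvalue,

\[ \lambda (1 - \lambda M_\mu)^{-1} L_\mu \, (1 - \lambda M_\mu)^{-1} |V|^{1/2} = \frac{\lambda}{2 \mu} \bigl\langle |V|^{1/2} , (1 - \lambda M_\mu)^{-1} |V|^{1/2} \bigr\rangle \, (1 - \lambda M_\mu)^{-1} |V|^{1/2} , \]

and this is equal to 1 if and only if $\mu$ satisfies the self-consistent equation

\[ \mu = G(\mu) := \frac{\lambda}{2} \bigl\langle |V|^{1/2} , (1 - \lambda M_\mu)^{-1} |V|^{1/2} \bigr\rangle . \]

Given that $\| \lambda M_\mu \| < 1$ for $\lambda \ll 1$ small enough, we can express $(1 - \lambda M_\mu)^{-1}$ in terms
of (9.3.4). Keeping only the first term of the expansion (9.3.4), we approximate $G$ by the
average of the potential

\[ G(\mu) = \frac{\lambda}{2} \int_{\mathbb{R}} dx \, |V(x)| + \mathcal{O}(\lambda^2) . \tag{9.3.5} \]

Hence, $G(\mu) = \mu$ has a solution provided $\lambda$ is small enough; additionally any solution
to this equation satisfies $\mu^{-1} \leq C_1 \lambda^{-1}$ for some constant $C_1 > 0$ and $\lambda$ small.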

Now that we know that a solution exists, we need to show uniqueness: Suppose we
have found two solutions $\mu_1 \leq \mu_2$. Then they both solve the self-consistent equation
$G(\mu_j) = \mu_j$, and assuming for a moment that $G$ is continuously differentiable in $\mu$, we use
the fundamental theorem of calculus to obtain

\[
| \mu_2 - \mu_1 | = | G(\mu_2) - G(\mu_1) | = \left| \int_{\mu_1}^{\mu_2} d\mu \, \partial_\mu G(\mu) \right|
\leq \sup_{\mu \in [\mu_1,\mu_2]} | \partial_\mu G(\mu) | \, | \mu_2 - \mu_1 | .
\]

If we can show $G$ is continuously differentiable and its derivative is bounded by $1/2$ for $\lambda$
small enough, then the above inequality reads $| \mu_2 - \mu_1 | \leq \frac{1}{2} | \mu_2 - \mu_1 |$. This is only
possible if $\mu_1 = \mu_2$, i. e. the solution is unique.

*In this context, the geometric series is usually referred to as Neumann series.

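To show the last bit, we note that $M_\mu$ and $(1 - z)^{-1}$ are real-analytic in $\mu$ so that their
composition $(1 - \lambda M_\mu)^{-1}$ is also real-analytic. The analyticity of $M_\mu$ for $\mu \in \mathbb{C}$,
$\mathrm{Re} \, \mu > 0$, also yields the bound

\[ \| \partial_\mu M_\mu \| \leq C_2 \, \mu^{-1} \tag{9.3.6} \]

via the Cauchy integral formula, because the maximal radius of the circular contour is less
than $\mu$.

The derivative of the resolvent can be related to $\partial_\mu M_\mu$ via the useful trick

\[
\begin{aligned}
0 = \partial_\mu (\mathrm{id}) &= \partial_\mu \bigl( (1 - \lambda M_\mu)^{-1} (1 - \lambda M_\mu) \bigr) \\
&= \partial_\mu (1 - \lambda M_\mu)^{-1} \, (1 - \lambda M_\mu) - \lambda \, (1 - \lambda M_\mu)^{-1} \, \partial_\mu M_\mu
\end{aligned}
\]

which yields

\[ | \partial_\mu G(\mu) | = \frac{\lambda^2}{2} \, \Bigl| \bigl\langle |V|^{1/2} , (1 - \lambda M_\mu)^{-1} \, \partial_\mu M_\mu \, (1 - \lambda M_\mu)^{-1} |V|^{1/2} \bigr\rangle \Bigr| . \]

The right-hand side can be estimated with the help of the Cauchy-Schwarz inequality

\[ \ldots \leq \frac{\lambda^2}{2} \, \bigl\| |V|^{1/2} \bigr\|_{L^2(\mathbb{R})}^2 \, \bigl\| (1 - \lambda M_\mu)^{-1} \bigr\|^2 \, \| \partial_\mu M_\mu \| =: C_3 \, \lambda^2 \, \| \partial_\mu M_\mu \| . \]

Combining (9.3.6) with $\mu^{-1} \leq C_1 \lambda^{-1}$ (which we obtained from $\mu = G(\mu)$), we find

\[ C_3 \, \lambda^2 \, \| \partial_\mu M_\mu \| \leq C_3 \, \lambda^2 \, C_2 \, \mu^{-1} \leq C_1 C_2 C_3 \, \lambda . \]

Put another way, we have deduced the bound $| \partial_\mu G(\mu) | \leq C \lambda$ which means that for $\lambda$
small enough, we can ensure that the derivative is less than $1/2$. Thus, the eigenvalue is
unique and we have shown the theorem. □

9.3.3.2 The min-max principle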

Now that we have established criteria for the existence of bound states below the
continuous spectrum for operators of the form $H = -\Delta_x + V$, we proceed to find other ways
to give estimates of their numerical values. Crucially, we shall always assume $H \geq c$ for
some $c \in \mathbb{R}$. Most of the methods of this chapter do not depend on the particular form of
the hamiltonian.

So let us assume we have established the existence of a ground state $\psi_0$, i. e. there exists
an eigenvalue $E_0 = \inf \sigma(H) < 0 = \inf \sigma_{\mathrm{ess}}(H)$ at the bottom of the spectrum, the ground
state energy, whose eigenfunction is $\psi_0$. Then the simplest estimate is obtained by minimizing
the Rayleigh quotient

\[ \frac{\langle \psi , H \psi \rangle}{\langle \psi , \psi \rangle} \]

for a family of trial wave functions (see also homework problem 54). Clearly, the Rayleigh
quotient is bounded from below by $E_0$ for otherwise, $E_0$ is not the infimum of the spectrum.

Proposition 9.3.8 (The Rayleigh-Ritz principle) Let $H$ be a densely defined, selfadjoint
operator which is bounded from below, i. e. there exists $c \in \mathbb{R}$ such that $H \geq c$. Then

\[ \inf \sigma(H) \leq \frac{\langle \psi , H \psi \rangle}{\langle \psi , \psi \rangle} \tag{9.3.7} \]

holds for all $\psi \in \mathcal{D}(H) \setminus \{ 0 \}$.

A rigorous proof of this innocent-looking fact (see e. g. [RS78, Theorem XIII.1]) requires
machinery that is not yet available to us.
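The Rayleigh-Ritz bound is easy to try out. The sketch below assumes a hypothetical test case that is not taken from the text, the harmonic oscillator $H = -\partial_x^2 + x^2$ on $L^2(\mathbb{R})$ with ground state energy 1, and Gaussian trial functions $\psi_a(x) = \mathrm{e}^{-a x^2 / 2}$. The quadratic form $\langle \psi , H \psi \rangle = \int |\psi'|^2 + x^2 |\psi|^2$ is evaluated by a simple Riemann sum; grid parameters are illustrative.

```python
import math

def rayleigh_quotient(a, L=12.0, n=4001):
    """Rayleigh quotient of psi_a(x) = exp(-a x^2 / 2) for H = -d^2/dx^2 + x^2."""
    h = 2 * L / (n - 1)
    numerator = denominator = 0.0
    for i in range(n):
        x = -L + i * h
        psi = math.exp(-a * x * x / 2)
        dpsi = -a * x * psi                      # derivative of the trial function
        numerator += (dpsi * dpsi + x * x * psi * psi) * h
        denominator += psi * psi * h
    return numerator / denominator

trial_widths = [0.25, 0.5, 1.0, 2.0, 4.0]
estimates = {a: rayleigh_quotient(a) for a in trial_widths}
best = min(estimates.values())

for a in trial_widths:
    print(a, estimates[a])   # every value is an upper bound for the ground state energy
```

Analytically the quotient equals $a/2 + 1/(2a)$, so every trial width gives an upper bound and $a = 1$, the true ground state, attains the infimum.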

A non-obvious fact is that we can also give a lower bound on the ground state energy: 

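Theorem 9.3.9 (Temple’s inequality, Theorem XIII.5 in [RS78]) Let $H$ be a selfadjoint
operator that is bounded from below with ground state energy $E_0 \in \sigma_p(H)$, $E_0 < 0$. Suppose in
addition $E_0 < E_1$ where $E_1$ is either the second eigenvalue (in case more eigenvalues exist)
or the bottom of the essential spectrum. Then for $\mu \in (E_0,E_1)$ and $\psi$ with $\| \psi \| = 1$ and
$\langle \psi , H \psi \rangle < \mu$, Temple’s inequality holds:

\[ E_0 \geq \langle \psi , H \psi \rangle - \frac{\langle \psi , H^2 \psi \rangle - \langle \psi , H \psi \rangle^2}{\mu - \langle \psi , H \psi \rangle} \]

Temple’s inequality gives an energy window for the ground state energy: if $\psi$ is close to
the ground state wave function, then the right-hand side is also close to $E_0$. On the other
hand, one needs to know a lower bound on the second eigenvalue $E_1$.

Proof By assumption, $E_0$ is an isolated eigenvalue of finite multiplicity (otherwise $E_0 =
E_1 = E_n$ for all $n \in \mathbb{N}$), and thus the operator $(H - E_0) (H - \mu) \geq 0$ is non-negative:
the product is $= 0$ if applied to the ground state and $> 0$ otherwise because $\mu < E_1$.
Consequently,

\[ \langle \psi , (H - \mu) H \psi \rangle \geq E_0 \, \langle \psi , (H - \mu) \psi \rangle \tag{9.3.8} \]

holds which, combined with the hypothesis $\langle \psi , (H - \mu) \psi \rangle < 0$, yields

\[ E_0 \geq \frac{\mu \, \langle \psi , H \psi \rangle - \langle \psi , H^2 \psi \rangle}{\mu - \langle \psi , H \psi \rangle} = \langle \psi , H \psi \rangle - \frac{\langle \psi , H^2 \psi \rangle - \langle \psi , H \psi \rangle^2}{\mu - \langle \psi , H \psi \rangle} . \; \square \]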
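Together with the Rayleigh-Ritz principle, Temple’s inequality brackets the ground state energy. A minimal sketch, assuming a hypothetical three-level operator written in its eigenbasis (the levels, weights and the value of $\mu$ are made up for illustration): the trial state enters only through the weights $|\langle \psi_n , \psi \rangle|^2$.

```python
def expectation(levels, weights, power=1):
    """<psi, H^power psi> for H diagonal with the given levels and weights |<psi_n, psi>|^2."""
    return sum(p * E ** power for p, E in zip(weights, levels))

levels = [-2.0, -0.5, 0.0]    # hypothetical spectrum with E_0 = -2.0 and E_1 = -0.5
weights = [0.9, 0.08, 0.02]   # normalized trial state, mostly ground state
mu = -0.6                     # any number in (E_0, E_1) with <psi, H psi> < mu

mean = expectation(levels, weights)           # <psi, H psi>
second = expectation(levels, weights, 2)      # <psi, H^2 psi>

upper = mean                                            # Rayleigh-Ritz upper bound
lower = mean - (second - mean ** 2) / (mu - mean)       # Temple lower bound

print(lower, levels[0], upper)   # lower <= E_0 <= upper
```

The window narrows as the weight on the ground state approaches 1, because the energy variance in the numerator of the correction term then tends to 0.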

What about other bound states below the essential spectrum (the ionization threshold)?
Usually, we do not know whether and how many eigenvalues exist. Nevertheless, we can
define a sequence of non-decreasing real numbers that coincides with the eigenvalues if
they exist: the Rayleigh quotient suggests to use

\[ E_0 := \inf_{\varphi \in \mathcal{D}(H) , \, \| \varphi \| = 1} \langle \varphi , H \varphi \rangle \]

as the definition of the ground state energy. Note that even if $H$ does not have eigenvalues,
$E_0$ is still well-defined and yields $\inf \sigma(H)$ (use a Weyl sequence). A priori, we do not
know whether $E_0$ is an eigenvalue, so we do not know whether an eigenvector exists.
However, if $E_0$ is an eigenvalue, then the eigenvector $\psi_1$ to the next eigenvalue $E_1$ (if it
exists) would necessarily have to be orthogonal to $\psi_0$. Then the next eigenvalue satisfies

\[ E_1 = \sup_{\varphi_0 \in \mathcal{D}(H) \setminus \{ 0 \}} \; \inf_{\substack{\varphi \in \mathcal{D}(H) , \, \| \varphi \| = 1 \\ \varphi \perp \varphi_0}} \langle \varphi , H \varphi \rangle . \]

It turns out that this is the good definition even if $E_0 \in \sigma_{\mathrm{ess}}(H)$ is not an eigenvalue
of finite multiplicity, because then $E_0 = E_1$. Quite generally, the candidate for the $n$th
eigenvalue is

\[ E_n := \sup_{\substack{\varphi_0 , \ldots , \varphi_{n-1} \in \mathcal{D}(H) \\ \langle \varphi_j , \varphi_k \rangle = \delta_{jk}}} \; \inf_{\substack{\varphi \in \{ \varphi_0 , \ldots , \varphi_{n-1} \}^{\perp} \cap \mathcal{D}(H) \\ \| \varphi \| = 1}} \langle \varphi , H \varphi \rangle . \]

Thus, we obtain a sequence of non-decreasing real numbers

\[ E_0 \leq E_1 \leq E_2 \leq \cdots \]

which, if they exist, are the eigenvalues repeated according to their multiplicities. One
can show rigorously that if $E_n = E_{n+1} = E_{n+2} = \cdots$, then $E_n = \inf \sigma_{\mathrm{ess}}(H)$ is the
bottom of the essential spectrum. Otherwise, the $E_n < \inf \sigma_{\mathrm{ess}}(H)$ are eigenvalues of finite
multiplicity. In that case, there are at most $n$ eigenvalues below the essential spectrum.

One may object that quite generally, it is impossible to evaluate $E_n$. Here is where the
min-max principle comes into play: assume we have chosen $n$ trial wave functions
$\varphi_0 , \ldots , \varphi_{n-1}$. Then this family of trial wave functions is a good candidate for the
first few eigenfunctions if the eigenvalues $\lambda_j$ of the matrix
$h := \bigl( \langle \varphi_j , H \varphi_k \rangle \bigr)_{0 \leq j,k \leq n-1}$ (ordered by size) are close to the $E_j$.


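Theorem 9.3.10 (The min-max principle) Suppose $H$ is a selfadjoint operator on the Hilbert
space $\mathcal{H}$ with domain $\mathcal{D}(H)$. Moreover, assume $H$ is bounded from below. Let
$\{ \varphi_0 , \ldots , \varphi_{n-1} \} \subset \mathcal{D}(H)$ be an orthonormal system of $n$ functions and consider
the $n \times n$ matrix

\[ h := \bigl( \langle \varphi_j , H \varphi_k \rangle \bigr)_{0 \leq j,k \leq n-1} \]

with eigenvalues $\lambda_0 \leq \lambda_1 \leq \ldots \leq \lambda_{n-1}$. Then we have that

\[ E_j \leq \lambda_j \qquad \forall j = 0 , \ldots , n-1 . \]

Proof We proceed by induction over $k$ (which enumerates the eigenvalues of $h$): denote
the normalized eigenvector to the lowest eigenvalue $\lambda_0$ with $v_0 = (v_{0,0} , \ldots , v_{0,n-1})$. Then
the normalized vector $\chi_0 := \sum_{j=0}^{n-1} v_{0,j} \, \varphi_j$ satisfies

\[ \lambda_0 = \langle v_0 , h v_0 \rangle_{\mathbb{C}^n} = \langle \chi_0 , H \chi_0 \rangle \geq E_0 \]

by the Rayleigh-Ritz principle.

Now assume we have shown that $E_l \leq \lambda_l$ holds for all $l = 0 , \ldots , k \leq n-2$. Clearly,
the eigenvectors $v_0 , \ldots , v_k$ to $h$ are orthonormal, and the space spanned by the
corresponding normalized $\chi_l = \sum_{j=0}^{n-1} v_{l,j} \, \varphi_j$ is $(k+1)$-dimensional. Hence, for any
normalized

\[ \chi = \sum_{j=0}^{n-1} w_j \, \varphi_j \in \{ \chi_0 , \ldots , \chi_k \}^{\perp} \]

with coefficients $w \in \{ v_0 , \ldots , v_k \}^{\perp}$, we obtain

\[ \langle w , h w \rangle = \langle \chi , H \chi \rangle \geq E_{k+1} \]

because $\chi$ is orthogonal to a $(k+1)$-dimensional subspace of $\mathcal{D}(H)$. The left-hand side can
be minimized by setting $w = v_{k+1}$, the eigenvector to $\lambda_{k+1}$; thus, $E_{k+1} \leq \lambda_{k+1}$. This
concludes the proof. □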
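In finite dimensions the statement $E_j \leq \lambda_j$ can be verified by hand. The following sketch assumes a hypothetical operator $H = \mathrm{diag}(1,2,3,4)$ on $\mathbb{C}^4$ and two orthonormal trial vectors that are not eigenvectors; the eigenvalues of the resulting $2 \times 2$ matrix $h$ are computed from the closed formula for symmetric $2 \times 2$ matrices.

```python
import math

def inner(u, v):
    """Scalar product of two real vectors."""
    return sum(a * b for a, b in zip(u, v))

levels = [1.0, 2.0, 3.0, 4.0]          # H = diag(levels), so E_j = levels[j]

def apply_H(v):
    return [E * c for E, c in zip(levels, v)]

trial = [[0.5, 0.5, 0.5, 0.5],         # orthonormal trial vectors phi_0, phi_1
         [0.5, -0.5, 0.5, -0.5]]

# Matrix h_jk = <phi_j, H phi_k>
h = [[inner(p, apply_H(q)) for q in trial] for p in trial]

# Eigenvalues of the real symmetric 2x2 matrix h
trace = h[0][0] + h[1][1]
det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
disc = math.sqrt(trace * trace / 4 - det)
ritz = [trace / 2 - disc, trace / 2 + disc]

print(ritz, levels[:2])   # Ritz values lie above the two lowest eigenvalues
```

Here $h$ has diagonal entries $2.5$ and off-diagonal entries $-0.5$, so the Ritz values are $2$ and $3$, which indeed dominate $E_0 = 1$ and $E_1 = 2$.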

One can use the min-max principle to make the following intuition rigorous: Assume one
is given an operator $H(V) = -\Delta_x + V$ whose potential vanishes sufficiently rapidly at $\infty$,
and one knows that $H(V)$ has a certain number of eigenvalues $\{ E_j(V) \}_{j \in \mathcal{I}}$, $\mathcal{I} \subseteq \mathbb{N}_0$. The
decay conditions on $V$ ensure $\sigma_{\mathrm{ess}}(H(V)) = [0,+\infty)$. Then if $W \leq V$ is a second potential
of the same type, the min-max principle implies

\[ E_j(W) \leq E_j(V) . \]

In particular, $H(W)$ has at least as many eigenvalues as $H(V)$. This fact combined with
Theorem 9.3.7 immediately yields

Corollary 9.3.11 Suppose we are in the setting of Theorem 9.3.7. Then for all $\gamma > 0$ the
Schrödinger operator $H = -\partial_x^2 + \gamma V$ has at least one eigenvalue $E_0 < 0$.

9.4 Magnetic fields 

Classically, there are two ways to include magnetic fields (cf. Chapter 3.5): either by
minimal substitution $p \mapsto p - A(x)$ which involves the magnetic vector potential $A$ or one
modifies the symplectic form to include the magnetic field $B = \nabla_x \times A$. Note that the
physical observable is the magnetic field rather than the vector potential, because there
are many vector potentials which represent the same magnetic field. For instance, if $A$ is
a vector potential to the magnetic field $B = \nabla_x \times A$, then also $A' = A + \nabla_x \phi$ is another
vector potential to $B$, because $\nabla_x \times \nabla_x \phi = 0$. The scalar function $\phi$ generates a gauge
transformation.

In contrast, one always needs to choose a vector potential in quantum mechanics,
and the hamiltonian for a non-relativistic particle subjected to an electromagnetic field
$(E,B) = (-\nabla_x V , \nabla_x \times A)$ is obtained by minimal substitution as well,

\[ H^A = (-\mathrm{i} \nabla_x - A)^2 + V . \tag{9.4.1} \]

What happens if we choose an equivalent gauge $A' = A + \nabla_x \phi$? It turns out that $H^A$ and
$H^{A'}$ are unitarily equivalent operators, and the unitary which connects the two is $\mathrm{e}^{-\mathrm{i} \phi}$,

\[ H^{A'} = \mathrm{e}^{+\mathrm{i} \phi} \, H^A \, \mathrm{e}^{-\mathrm{i} \phi} . \]

Using the lingo of Chapter 9.2.5, $\mathrm{e}^{-\mathrm{i} \phi}$ is a unitary that connects two different
representations. This has several very important ramifications. The spectrum $\sigma(H^A)$, for
instance, only depends on the magnetic field $B = \nabla_x \times A$ because unitarily equivalent operators
necessarily have the same spectrum. Moreover, the gauge freedom is essential to solving
problems, because some gauges are nicer to work with than others. One such condition is
$\nabla_x \cdot A = 0$, known as Coulomb gauge.
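The unitary equivalence can be verified in two lines. The following worked equation is a sketch assuming $\phi$ is a smooth real-valued function and $\psi$ is sufficiently regular, so that the product rule applies; the potential $V$ is a multiplication operator and commutes with $\mathrm{e}^{\pm \mathrm{i} \phi}$.

```latex
\begin{align*}
(-\mathrm{i}\nabla_x - A) \bigl( \mathrm{e}^{-\mathrm{i}\phi} \psi \bigr)
  &= \mathrm{e}^{-\mathrm{i}\phi} \bigl( -\mathrm{i}\nabla_x - \nabla_x \phi - A \bigr) \psi
   = \mathrm{e}^{-\mathrm{i}\phi} \bigl( -\mathrm{i}\nabla_x - A' \bigr) \psi , \\
\mathrm{e}^{+\mathrm{i}\phi} \, (-\mathrm{i}\nabla_x - A)^2 \, \mathrm{e}^{-\mathrm{i}\phi}
  &= \bigl( \mathrm{e}^{+\mathrm{i}\phi} (-\mathrm{i}\nabla_x - A) \, \mathrm{e}^{-\mathrm{i}\phi} \bigr)^2
   = (-\mathrm{i}\nabla_x - A')^2 .
\end{align*}
```

The first line is the product rule, and the second follows by inserting $\mathrm{e}^{-\mathrm{i}\phi} \mathrm{e}^{+\mathrm{i}\phi} = 1$ between the two factors of the square.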

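The natural domains of these operators are magnetic Sobolev spaces.

Definition 9.4.1 (Magnetic Sobolev spaces $H^m_A(\mathbb{R}^d)$) Suppose $A$ is a sufficiently regular
vector potential. Then we define the magnetic Sobolev space of order $m$ to be

\[ H^m_A(\mathbb{R}^d) := \bigl\{ \psi \in L^2(\mathbb{R}^d) \; \big| \; \| \psi \|_{H^m_A} < \infty \bigr\} \tag{9.4.2} \]

where the $m$th magnetic Sobolev norm is

\[ \| \psi \|_{H^m_A}^2 := \sum_{| \gamma | \leq m} \bigl\| (-\mathrm{i} \nabla_x - A)^{\gamma} \psi \bigr\|_{L^2(\mathbb{R}^d)}^2 . \tag{9.4.3} \]

For $A = 0$, we abbreviate the (ordinary) Sobolev space with $H^m(\mathbb{R}^d) := H^m_{A=0}(\mathbb{R}^d)$.

The definition just means we are looking at those $\psi \in L^2(\mathbb{R}^d)$ whose weak derivatives of
up to $m$th order are all in $L^2(\mathbb{R}^d)$.* One can see that Sobolev spaces are complete and can
be equipped with a scalar product (see e. g. [LL01, Theorem 7.3] for the case $m = 1$ and
$A = 0$).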

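Magnetic fields have the property that they induce oscillations, and these induced
oscillations, in turn, increase the kinetic energy. The diamagnetic inequality makes this
intuition rigorous:

Theorem 9.4.2 (Diamagnetic inequality) Let $A : \mathbb{R}^d \to \mathbb{R}^d$ be in $C^1(\mathbb{R}^d,\mathbb{R}^d)$ and $\psi$ be
in $H^1_A(\mathbb{R}^d)$. Then $| \psi |$, the absolute value of $\psi$, is in $H^1(\mathbb{R}^d)$ and the diamagnetic inequality,

\[ \bigl| \nabla_x | \psi | (x) \bigr| \leq \bigl| (-\mathrm{i} \nabla_x - A(x)) \psi(x) \bigr| , \tag{9.4.4} \]

holds pointwise for almost all $x \in \mathbb{R}^d$.

Proof Since $\psi \in L^2(\mathbb{R}^d)$ and each component of $A$ is in $C^1(\mathbb{R}^d,\mathbb{R}^d) \subset L^2_{\mathrm{loc}}(\mathbb{R}^d,\mathbb{R}^d)$, the
distributional gradient of $\psi$ is in $L^1_{\mathrm{loc}}(\mathbb{R}^d)$. The distributional derivative of the absolute
value can be computed explicitly,

\[ \partial_{x_j} | \psi | (x) = \begin{cases} \mathrm{Re} \left( \frac{\overline{\psi(x)}}{| \psi(x) |} \, \partial_{x_j} \psi(x) \right) & \psi(x) \neq 0 \\ 0 & \psi(x) = 0 \end{cases} \tag{9.4.5} \]

and the right-hand side is again a function in $L^1_{\mathrm{loc}}(\mathbb{R}^d)$. Given that $A$ and $| \psi |$ are real,

\[ \mathrm{Re} \left( \frac{\overline{\psi(x)}}{| \psi(x) |} \, \bigl( -\mathrm{i} A_j(x) \bigr) \psi(x) \right) = \mathrm{Re} \bigl( -\mathrm{i} A_j(x) \, | \psi(x) | \bigr) = 0 , \]

and we can add this term to equation (9.4.5) free of charge to obtain

\[ \partial_{x_j} | \psi | (x) = \begin{cases} \mathrm{Re} \left( \frac{\overline{\psi(x)}}{| \psi(x) |} \, \bigl( \partial_{x_j} - \mathrm{i} A_j(x) \bigr) \psi(x) \right) & \psi(x) \neq 0 \\ 0 & \psi(x) = 0 \end{cases} . \]

The diamagnetic inequality now follows from $| \mathrm{Re} \, z | \leq | z |$, $z \in \mathbb{C}$, and
$| (\partial_{x_j} - \mathrm{i} A_j) \psi | = | (-\mathrm{i} \partial_{x_j} - A_j) \psi |$. The left-hand side of
(9.4.4) is in $L^2(\mathbb{R}^d)$ since the right-hand side is by assumption on $\psi$. □

*Footnote to Definition 9.4.1: The weak derivative is well-defined, because we can view $L^2(\mathbb{R}^d)$ as a subspace of the tempered distributions $\mathcal{S}'(\mathbb{R}^d)$.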

The simplest example of a magnetic Hamilton operator $H^A = (-\mathrm{i} \nabla_x - A)^2$ is the so-called
Landau hamiltonian where $d = 2$, $B$ is constant and $V = 0$. For instance, one can choose
the symmetric gauge

\[ A(x) = \frac{B}{2} \begin{pmatrix} -x_2 \\ +x_1 \end{pmatrix} \tag{9.4.6} \]

or the Landau gauge, e. g.

\[ A(x) = B \begin{pmatrix} -x_2 \\ 0 \end{pmatrix} . \tag{9.4.7} \]

In units where $|B| = 1$, the spectrum $\sigma(H^A) = \{ 2n + 1 \mid n \in \mathbb{N}_0 \} = \sigma_{\mathrm{ess}}(H^A)$ consists of
the Landau levels, a collection of infinitely degenerate eigenvalues accumulating at $+\infty$.
Physically, the origin for this massive degeneracy is translation-covariance: if we have a
bound state $\psi_0$, then $\psi_0(\cdot - x_0)$ is an eigenvector to a possibly gauge-transformed
hamiltonian $H^{A'}$. From a classical perspective, the existence of bound states as well as
translational symmetry are also clear: a constant magnetic field traps a particle in a
circular orbit, and the analog of this classical bound state is a quantum bound state, an
eigenvector.


9.5 Bosons vs. fermions 

The extension of single-particle quantum mechanics to multi-particle quantum mechanics
is highly non-trivial. To clarify the presentation, let us focus on two identical particles
moving in $\mathbb{R}^d$. Two options arise: either the compound wave function $\Psi$ is a function
on $\mathbb{R}^d$, i. e. it acts like a density, or it is a function on $\mathbb{R}^d \times \mathbb{R}^d$ where each set of
coordinates in $x = (x_1,x_2)$ is associated to one particle. It turns out that wave functions
are functions on $\mathbb{R}^{Nd}$ where $N$ is the number of particles.

However, that is not all, there is an added complication: classically, we can label
identical particles by tracking their trajectory. This is impossible in the quantum
framework, because the uncertainty principle forbids any such tracking procedure. Given that
$| \Psi(x_1,x_2) |^2$ is a physical observable, the inability to distinguish particles implies

\[ | \Psi(x_1,x_2) |^2 = | \Psi(x_2,x_1) |^2 \]

and hence, $\Psi(x_1,x_2) = \mathrm{e}^{+\mathrm{i} \theta} \, \Psi(x_2,x_1)$. However, given that exchanging variables twice
must give the same wave function, the only two admissible phase factors are $\mathrm{e}^{+\mathrm{i} \theta} = \pm 1$.

Particles for which $\Psi(x_1,x_2) = \Psi(x_2,x_1)$ holds are bosons (integer spin) while those
for which $\Psi(x_1,x_2) = -\Psi(x_2,x_1)$ are fermions (half-integer spin). Examples are bosonic
photons and fermionic electrons. This innocent looking fact has very, very strong
consequences on the physical and mathematical properties of quantum systems. The most
immediate implication is Pauli’s exclusion principle for fermions,

\[ \Psi(x,x) = 0 , \]

a fact that is colloquially summarized by saying that bosons are social (because they like
to bunch together) while sociophobic fermions tend to avoid one another.

To make this more rigorous, let us consider the splitting

\[ L^2(\mathbb{R}^d \times \mathbb{R}^d) = L^2_{\mathrm{s}}(\mathbb{R}^d \times \mathbb{R}^d) \oplus L^2_{\mathrm{as}}(\mathbb{R}^d \times \mathbb{R}^d) \]

into symmetric and antisymmetric part induced via $f = f_{\mathrm{s}} + f_{\mathrm{as}}$ where

\[
\begin{aligned}
f_{\mathrm{s}}(x_1,x_2) &:= \tfrac{1}{2} \bigl( f(x_1,x_2) + f(x_2,x_1) \bigr) , \\
f_{\mathrm{as}}(x_1,x_2) &:= \tfrac{1}{2} \bigl( f(x_1,x_2) - f(x_2,x_1) \bigr) .
\end{aligned}
\]

Then one can proceed and restrict the two-particle Schrödinger operator $H$ (a sum over the
particle index $j = 1,2$) to either the bosonic space $L^2_{\mathrm{s}}(\mathbb{R}^d \times \mathbb{R}^d)$ or the fermionic
space $L^2_{\mathrm{as}}(\mathbb{R}^d \times \mathbb{R}^d)$. The kinetic energy $- \sum_{j=1,2} \Delta_{x_j}$ preserves the
(anti-)symmetry, e. g. in the antisymmetric (fermionic) case it defines a bounded linear
map

\[ H : L^2_{\mathrm{as}}(\mathbb{R}^d \times \mathbb{R}^d) \cap H^2(\mathbb{R}^d \times \mathbb{R}^d) \longrightarrow L^2_{\mathrm{as}}(\mathbb{R}^d \times \mathbb{R}^d) . \]
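The splitting of a two-particle function into its symmetric and antisymmetric part, $f = f_{\mathrm{s}} + f_{\mathrm{as}}$ with $f_{\mathrm{s}/\mathrm{as}}(x_1,x_2) = \frac{1}{2} \bigl( f(x_1,x_2) \pm f(x_2,x_1) \bigr)$, can be tried out directly. The sketch below assumes $d = 1$ and a hypothetical test function `f` chosen only for illustration; `symmetric_part` and `antisymmetric_part` are names introduced here.

```python
import math

def symmetric_part(f):
    """Bosonic part: f_s(x1, x2) = (f(x1, x2) + f(x2, x1)) / 2."""
    return lambda x1, x2: 0.5 * (f(x1, x2) + f(x2, x1))

def antisymmetric_part(f):
    """Fermionic part: f_as(x1, x2) = (f(x1, x2) - f(x2, x1)) / 2."""
    return lambda x1, x2: 0.5 * (f(x1, x2) - f(x2, x1))

def f(x1, x2):
    """Hypothetical two-particle function without any exchange symmetry."""
    return (x1 + 2.0 * x2) * math.exp(-x1 ** 2 - 0.5 * x2 ** 2)

f_s = symmetric_part(f)
f_as = antisymmetric_part(f)

points = [(0.3, -1.2), (1.0, 0.5), (-0.7, -0.7)]
for x1, x2 in points:
    print(f_s(x1, x2) + f_as(x1, x2) - f(x1, x2), f_as(x1, x1))  # both vanish
```

The second printed number illustrates Pauli’s exclusion principle: the antisymmetric part vanishes on the diagonal $x_1 = x_2$.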

9.6 Perturbation theory 

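One last, but important remark concerns perturbation theory. Almost none of the systems
one encounters in “real life” has a closed-form solution, so it is natural to study
perturbations of known systems first. The physics literature usually contents itself with
studying approximations of eigenvalues of the hamiltonian, but the more fundamental question
is what happens to the dynamics? In other words, does $H_1 \approx H_2$ imply
$\mathrm{e}^{-\mathrm{i} t H_1} \approx \mathrm{e}^{-\mathrm{i} t H_2}$? The answer is yes and uses a very, very nifty trick, the
Duhamel formula. The idea is to write the difference

\[
\begin{aligned}
\mathrm{e}^{-\mathrm{i} t H_1} - \mathrm{e}^{-\mathrm{i} t H_2} &= \int_0^t ds \, \frac{d}{ds} \Bigl( \mathrm{e}^{-\mathrm{i} s H_1} \, \mathrm{e}^{-\mathrm{i} (t-s) H_2} \Bigr) \\
&= -\mathrm{i} \int_0^t ds \, \mathrm{e}^{-\mathrm{i} s H_1} \, (H_1 - H_2) \, \mathrm{e}^{-\mathrm{i} (t-s) H_2}
\end{aligned}
\tag{9.6.1}
\]

as the integral of a total derivative. So if we assume $H_2 = H_1 + \varepsilon W$, then one has to
estimate

\[ \mathrm{e}^{-\mathrm{i} s H_1} \, (H_1 - H_2) \, \mathrm{e}^{-\mathrm{i} (t-s) H_2} = \mathcal{O}(\varepsilon) . \]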
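For bounded $W$, the Duhamel formula gives $\| \mathrm{e}^{-\mathrm{i} t H_1} - \mathrm{e}^{-\mathrm{i} t H_2} \| \leq |t| \, \varepsilon \, \| W \|$, since the propagators are unitary. The sketch below checks this in a hypothetical two-level example, $H_1 = \sigma_z$ and $H_2 = \sigma_z + \varepsilon \sigma_x$ with $\| \sigma_x \| = 1$; the closed formula for the exponential uses that such matrices square to a multiple of the identity. All names are illustrative.

```python
import math

def propagator(h_z, h_x, t):
    """exp(-i t H) for H = h_z*sigma_z + h_x*sigma_x, using H^2 = r^2 * id."""
    r = math.hypot(h_z, h_x)
    c, s = math.cos(r * t), math.sin(r * t) / r
    return [[c - 1j * s * h_z, -1j * s * h_x],
            [-1j * s * h_x, c + 1j * s * h_z]]

def operator_norm(D):
    """Largest singular value of a 2x2 matrix via the eigenvalues of D* D."""
    a = abs(D[0][0]) ** 2 + abs(D[1][0]) ** 2
    d = abs(D[0][1]) ** 2 + abs(D[1][1]) ** 2
    b = D[0][0].conjugate() * D[0][1] + D[1][0].conjugate() * D[1][1]
    return math.sqrt((a + d) / 2 + math.sqrt(((a - d) / 2) ** 2 + abs(b) ** 2))

eps = 0.05
times = [0.5, 1.0, 5.0, 20.0]
gaps = []
for t in times:
    U1 = propagator(1.0, 0.0, t)      # exp(-i t H_1)
    U2 = propagator(1.0, eps, t)      # exp(-i t H_2)
    D = [[U1[i][j] - U2[i][j] for j in range(2)] for i in range(2)]
    gaps.append(operator_norm(D))

for t, gap in zip(times, gaps):
    print(t, gap, eps * t)            # gap stays below the Duhamel bound eps * t
```

The bound grows linearly in $|t|$, so closeness of the dynamics is guaranteed on time scales $|t| \ll 1/\varepsilon$.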

Note that this estimate of the integrand holds for all times, because quantum mechanics is
a linear theory. Otherwise, we would have to use the Gronwall Lemma 2.2.6 that places
restrictions on the time scale for which $\psi_1(t) = \mathrm{e}^{-\mathrm{i} t H_1} \psi$ and
$\psi_2(t) = \mathrm{e}^{-\mathrm{i} t H_2} \psi$ remain close.

Chapter 10

Variational calculus


Functionals $\mathcal{E} : \Omega \subseteq \mathcal{X} \to \mathbb{C}$ are maps from a subset $\Omega$ of a Banach space $\mathcal{X}$ over the
field $\mathbb{C}$ (or $\mathbb{R}$) to $\mathbb{C}$ (or $\mathbb{R}$). In case $\mathcal{X}$ is finite-dimensional, a functional is just a function
$\mathbb{C}^n \to \mathbb{C}$, and so the cases we are really interested in are when $\mathcal{X}$ is infinite-dimensional.

Functionals arise very often in physics as a way to formulate certain fundamental
principles (e. g. energy, action and the like); their analysis often produces linear and non-linear
PDEs which are interesting in their own right. For instance, the energy functional

\[ \mathcal{E}(\psi) := \int_{\mathbb{R}^d} dx \, \Bigl( | \nabla_x \psi(x) |^2 + V(x) \, | \psi(x) |^2 \Bigr) \tag{10.0.1} \]

associated to the Schrödinger operator $H = -\Delta_x + V$ can be seen as a map
$H^1(\mathbb{R}^d) \ni \psi \mapsto \mathcal{E}(\psi) \in \mathbb{R}$. Here, $H^1(\mathbb{R}^d)$ is the first Sobolev space, cf. Definition 9.4.1.
Let us assume $H$ is selfadjoint, bounded from below and has a ground state $\psi_0$. Then if we
minimize $\mathcal{E}$ under the constraint $\| \psi \| = 1$, the functional has a global minimum at $\psi_0$.
Alternatively, we can view it as the minimizer of the Rayleigh-Ritz quotient.

So let us perturb the Rayleigh-Ritz quotient in the vicinity of the minimizer $\psi_0$, i. e. we
define

\[ F(s) := \frac{\mathcal{E}(\psi_0 + s \varphi)}{\| \psi_0 + s \varphi \|^2} \geq F(0) = \frac{\mathcal{E}(\psi_0)}{\| \psi_0 \|^2} = E_0 \]

where $\varphi \in H^1(\mathbb{R}^d)$ is arbitrary, but fixed. One can express denominator and numerator
explicitly as quadratic polynomials in $s$, and one finds

\[ \frac{d}{ds} F(0) = 2 \, \| \psi_0 \|^{-2} \, \mathrm{Re} \, \bigl\langle \varphi , (-\Delta_x + V - E_0) \psi_0 \bigr\rangle = 0 , \]

i. e. $F$ has a global minimum at 0 independently of the choice of function $\varphi$. Put more
succinctly: if it exists, the minimizer $\psi_0$ of the Rayleigh-Ritz quotient is the eigenfunction
of the Schrödinger operator $H = -\Delta_x + V$ at $E_0 = \inf \sigma(H)$.

The energy functional only serves as an amuse gueule. Among other things, it suggests
that one can ask the same questions for functionals that one can ask for (ordinary)
functions:

(1) Existence of local and global extrema, convexity properties and the existence of ex¬ 
trema under constraints. 

(2) One can study ODEs where the vector field is connected to derivatives of a functional; 
in the simplest case, we want to look at gradient flows. The same fundamental ques¬ 
tions arise: Where are the fixed points? Are these fixed points stable or unstable? 

10.1 Extremals of functionals 

Given that functionals are just functions on infinite-dimensional spaces, it is not surprising 
that the same type of questions are raised as with ordinary functions: continuity, differen¬ 
tiability, existence of local and global extrema. As one can guess, a rigorous treatment of 
functionals is a lot more technical. 

10.1.1 The Gateaux derivative 

Apart from continuity, the most fundamental property a function has is that of differentia¬ 
bility. On functionals, the starting point is the directional derivative which then gives rise 
to the Gateaux derivative. Similar to functions on R" vs. functions on C", we start with 
the real case first and postpone the discussion of complex derivatives to Chapter 10.1.6. 

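Definition 10.1.1 (Gateaux derivative) Let $\mathcal{E} : \Omega \subseteq \mathcal{X} \to \mathbb{R}$ be a continuous functional
defined on an open subset $\Omega$ of the real Banach space $\mathcal{X}$. Then the Gateaux derivative
$d\mathcal{E}(\psi)$ at $\psi \in \Omega$ is defined as the linear functional on $\mathcal{X}$ for which

\[ d\mathcal{E}(\psi) \varphi = \frac{d}{ds} \mathcal{E}(\psi + s \varphi) \Big|_{s=0} \tag{10.1.1} \]

holds for all $\varphi \in \mathcal{X}$. If the Gateaux derivative exists for all $\psi \in \Omega$, we say $\mathcal{E}$ is $C^1$.

Higher derivatives are multilinear forms which are defined iteratively.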

10.1.2 Extremal points and the Euler-Lagrange equations 

Critical points are those $\psi_\ast \in \mathcal{X}$ for which the Gateaux derivative vanishes, $d\mathcal{E}(\psi_\ast) = 0$. To
illustrate the connection between PDEs and critical points, let us consider the functional

\[ \mathcal{E}(u) := \int_{\mathbb{R}^d} dx \, \Big( \tfrac{1}{2} \, |\nabla_x u|^2 + u \, f \Big) \]

where $f : \mathbb{R}^d \longrightarrow \mathbb{R}$ is some fixed function and $u \in C_c^\infty(\mathbb{R}^d) \subset H^1(\mathbb{R}^d,\mathbb{R})$. A quick
computation yields the Gateaux derivative, and if we set the right-hand side of

\[ d\mathcal{E}(u) v = \int_{\mathbb{R}^d} dx \, \big( \nabla_x u \cdot \nabla_x v + f \, v \big) = \int_{\mathbb{R}^d} dx \, \big( -\Delta_x u + f \big) \, v = 0 \tag{10.1.2} \]

to zero for all $v \in H^1(\mathbb{R}^d,\mathbb{R})$, we obtain the condition for a local extremum: $u$ is a critical
point if and only if $u$ satisfies the Poisson equation

\[ -\Delta_x u + f = 0 . \tag{10.1.3} \]

Depending on the context, we call either (10.1.2) or (10.1.3) the Euler-Lagrange equation
of $\mathcal{E}$. So if $\mathcal{X}$ is comprised of functions, then the search for critical points is equivalent to
solving a linear or non-linear PDE.
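The first equality in (10.1.2) can be checked numerically on a grid. The following is only a sketch under assumptions that are not part of the notes: it works in one dimension on the interval $[-1,1]$, uses forward differences for the derivative, and the sample functions `u`, `v`, `f` as well as the helper names `energy` and `gateaux_formula` are made up for illustration.

```python
import math

def energy(u, f, dx):
    """Discretized E(u) = sum over grid of dx * (0.5 * |u'|^2 + f * u)."""
    grad = sum(0.5 * ((u[i + 1] - u[i]) / dx) ** 2 for i in range(len(u) - 1))
    src = sum(f[i] * u[i] for i in range(len(u)))
    return dx * (grad + src)

def gateaux_formula(u, v, f, dx):
    """Discretized version of the first integral in (10.1.2): dx * (u' v' + f v)."""
    grad = sum((u[i + 1] - u[i]) * (v[i + 1] - v[i]) / dx ** 2
               for i in range(len(u) - 1))
    src = sum(f[i] * v[i] for i in range(len(u)))
    return dx * (grad + src)

# Hypothetical sample data on [-1, 1]
n = 201
dx = 2.0 / (n - 1)
xs = [-1.0 + i * dx for i in range(n)]
u = [math.exp(-8 * x * x) for x in xs]
v = [(1 - x * x) * math.sin(3 * x) for x in xs]
f = [math.cos(x) for x in xs]

# Difference quotient of s -> E(u + s v) at s = 0 (central difference)
s = 1e-5
u_plus = [a + s * b for a, b in zip(u, v)]
u_minus = [a - s * b for a, b in zip(u, v)]
finite_diff = (energy(u_plus, f, dx) - energy(u_minus, f, dx)) / (2 * s)

exact = gateaux_formula(u, v, f, dx)
print(finite_diff, exact)
```

Because the discretized functional is a quadratic polynomial in $s$, the central difference reproduces the directional derivative up to rounding errors, so the two printed numbers should agree closely.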


10.1.3 Functionals on submanifolds 


Very often the functional is defined on a subset $\Omega \subset \mathcal{X}$ which lacks a linear structure
(e. g. Lagrangian mechanics below) so that $\mathcal{E}(\psi + s \varphi)$ need not make sense, because
$\psi + s \varphi \notin \Omega$.

Instead, one has to replace the simple linear combination $\psi + s \varphi$ with differentiable
paths $(-\delta,+\delta) \ni s \mapsto \psi_s \in \Omega$. Then a tangent vector $\xi$ at $\psi$ is an equivalence class of
paths so that $\psi_0 = \psi$, and the tangent space $T_\psi \Omega$ is then the vector space
spanned by these tangent vectors. Hence, we proceed as in Definition 10.1.1 and set

\[ d\mathcal{E}(\psi) \xi := \frac{d}{ds} \mathcal{E}(\psi_s) \Big|_{s=0} . \]

Clearly, $d\mathcal{E}(\psi) \in (T_\psi \Omega)'$ is an element of the dual space since it maps a tangent vector
onto a scalar (cf. Definition 4.3.3 of the dual space). In general, $T_\psi \Omega$ is just an abstract
vector space, but here we can identify $T_\psi \Omega$ with a subvector space of $\mathcal{X}$. For a detailed
description of the mathematical structures (manifolds and tangent bundles), we refer to
[MR99, Chapter 4.1].


10.1.4 Lagrangian mechanics 

One extremely important example where the variation takes place over a non-linear space
is that of classical mechanics. Here we start with the space of paths

\[ \mathcal{D}(x_0,x_1) := \Big\{ q : [0,T] \longrightarrow \mathcal{X} \;\Big|\; q \in C^2([0,T],\mathcal{X}) , \; q(0) = x_0 , \; q(T) = x_1 \Big\} \tag{10.1.4} \]

which join $x_0$ and $x_1$ in the Banach space $\mathcal{X}$; initially, let us concentrate on the case $\mathcal{X} = \mathbb{R}^d$,
but more general choices are also admissible. As we will see later on, the choices of
$x_0$, $x_1$ and $T$ are completely irrelevant: the Euler-Lagrange equations will be independent
of them.

The idea is to derive classical equations of motion from the Euler-Lagrange equations of
the action functional

\[ S(q) := \int_0^T dt \, L\big( q(t) , \dot{q}(t) \big) \tag{10.1.5} \]

where $L \in C^2(\mathbb{R}^d \times \mathbb{R}^d)$ is the Lagrange function; $L(x,v)$ depends on position $x$ and
velocity $v$ (as opposed to momentum). Physicists call this the principle of stationary action.

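We exploit the linear structure of $\mathcal{X} = \mathbb{R}^d$ and propose that we can canonically identify
the tangent space $T_q \mathcal{D}(x_0,x_1)$ with $\mathcal{D}(0,0)$: if $h \in \mathcal{D}(0,0)$, then by definition $q + s h \in
\mathcal{D}(x_0,x_1)$ is a path which joins $x_0$ and $x_1$ for all $s \in \mathbb{R}$ with tangent vector $h$. The Euler-Lagrange
equations can now be derived easily: by definition $dS(q) = 0$ means $dS(q) h = 0$
holds for all $h \in \mathcal{D}(0,0)$, and we compute

\begin{align*}
dS(q) h &= \frac{d}{ds} S(q + s h) \Big|_{s=0}
= \int_0^T dt \, \frac{d}{ds} L\big( q(t) + s h(t) , \dot{q}(t) + s \dot{h}(t) \big) \Big|_{s=0} \\
&= \int_0^T dt \, \Big[ \nabla_x L\big( q(t),\dot{q}(t) \big) \cdot h(t) + \nabla_v L\big( q(t),\dot{q}(t) \big) \cdot \dot{h}(t) \Big] \\
&= \int_0^T dt \, \Big[ \nabla_x L\big( q(t),\dot{q}(t) \big) - \frac{d}{dt} \nabla_v L\big( q(t),\dot{q}(t) \big) \Big] \cdot h(t) . \tag{10.1.6}
\end{align*}

Note that the boundary terms vanish, because $h(0) = 0 = h(T)$. Clearly, the Euler-Lagrange
equations

\[ \nabla_x L\big( q(t),\dot{q}(t) \big) - \frac{d}{dt} \nabla_v L\big( q(t),\dot{q}(t) \big) = 0 \tag{10.1.7} \]

are independent of $x_0$, $x_1$ and $T$, as promised. Moreover, the linear nature of $\mathcal{X} = \mathbb{R}^d$ is
not crucial in the derivation, but a rigorous definition is more intricate in the case where
$\mathcal{X}$ is a manifold (cf. [MR99, Chapter 7]).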

10.1.4.1 Classical mechanics on $\mathbb{R}^d$

The standard example for a Lagrangian is $L(x,v) = \frac{m}{2} v^2 - U(x)$ where $U$ is the potential
(we use $U$ rather than $V$ to avoid the ambiguity between the velocity $v$ and the potential
$V$). A simple computation yields

\begin{align*}
0 &= \nabla_x L\big( q(t),\dot{q}(t) \big) - \frac{d}{dt} \nabla_v L\big( q(t),\dot{q}(t) \big) \\
&= -\nabla_x U\big( q(t) \big) - m \, \ddot{q}(t)
\end{align*}

or $m \ddot{q} = -\nabla_x U$. This second-order equation can be reduced to a first-order ODE by setting
$\dot{q} = v$, and one obtains

\[ \frac{d}{dt} \begin{pmatrix} q \\ v \end{pmatrix} = \begin{pmatrix} v \\ -\frac{1}{m} \nabla_x U(q) \end{pmatrix} . \]

Glancing back at the beginning of Chapter 3, we guess the simple change of variables
$p := m v$ and recover Hamilton’s equations of motion (3.0.2). In fact, this innocent change
of variables is an instance of a much deeper fact, namely that momentum can be defined
as

\[ p := \nabla_v L . \]
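The principle of stationary action can be illustrated numerically. The sketch below is an illustration under stated assumptions, not part of the notes: it takes $d = 1$, $m = 1$ and the harmonic potential $U(x) = \frac{1}{2} x^2$, discretizes the action (10.1.5) with a midpoint rule, and compares the directional derivative $dS(q)h$ for the true solution $q(t) = \cos t$ of $m \ddot{q} = -\nabla_x U$ with that for the straight line joining the same end points. The names `action` and `directional_derivative` are hypothetical helpers.

```python
import math

def action(q, dt, m=1.0, omega=1.0):
    """Discretized S(q) = sum dt * (m/2 * qdot^2 - U(q)), U(x) = omega^2 x^2 / 2."""
    total = 0.0
    for k in range(len(q) - 1):
        vel = (q[k + 1] - q[k]) / dt          # velocity on the k-th interval
        mid = 0.5 * (q[k] + q[k + 1])         # position at the interval midpoint
        total += dt * (0.5 * m * vel ** 2 - 0.5 * omega ** 2 * mid ** 2)
    return total

def directional_derivative(q, h, dt, s=1e-4):
    """Central difference approximation of d/ds S(q + s h) at s = 0."""
    plus = [a + s * b for a, b in zip(q, h)]
    minus = [a - s * b for a, b in zip(q, h)]
    return (action(plus, dt) - action(minus, dt)) / (2 * s)

T, n = 2.0, 2001
dt = T / (n - 1)
ts = [k * dt for k in range(n)]

# Variation with h(0) = 0 = h(T), as required for tangent vectors in D(0,0)
h = [math.sin(math.pi * t / T) for t in ts]

true_path = [math.cos(t) for t in ts]                 # solves qddot = -q
x0, x1 = true_path[0], true_path[-1]
straight = [x0 + (x1 - x0) * t / T for t in ts]       # same end points, not a solution

d_true = directional_derivative(true_path, h, dt)
d_straight = directional_derivative(straight, h, dt)
print(d_true, d_straight)
```

For the true path the derivative vanishes up to the discretization error, whereas for the straight line it is clearly non-zero; this only tests one particular variation $h$, so it is a consistency check rather than a proof of stationarity.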


10.1.4.2 Derivation of Maxwell’s equations

The idea to derive the dynamical equations of a physical theory as a critical point from 
an action functional is extremely successful; almost any physical theory (e. g. general 
relativity, quantum electrodynamics and fluid dynamics) can be derived in this formalism, 
and hence, a better understanding of functionals gives one access to a richly stocked 
toolbox. Moreover, they yield equations of motion in situations where one wants to couple 
degrees of freedom of a different nature (e. g. fluid dynamics and electrodynamics). 

To illustrate this, we will derive the vacuum Maxwell equations (cf. Chapter 5.5 for
$\varepsilon = 1 = \mu$). It all starts with a clever choice of Lagrange function, in this case

\begin{align*}
L(t,A,\phi,\alpha,\varphi) :&= \int_{\mathbb{R}^3} dx \, \mathcal{L}\big( t , A(x) , \phi(x) , \alpha(x) , \varphi(x) \big) \\
&= \int_{\mathbb{R}^3} dx \, \Big( \tfrac{1}{2} \big| -\alpha(x) - \nabla_x \phi(x) \big|^2 - \tfrac{1}{2} \big| \nabla_x \times A(x) \big|^2 + j(t,x) \cdot A(x) - \rho(t,x) \, \phi(x) \Big)
\end{align*}

where $A$ is the vector potential, $\phi$ the scalar potential, $j$ the current density and $\rho$ the
charge density. Because of charge conservation

\[ \nabla_x \cdot j + \partial_t \rho = 0 , \tag{10.1.8} \]

current and charge density are linked. The potentials are linked to the electromagnetic
field via

\begin{align*}
E &= -\partial_t A - \nabla_x \phi , \\
B &= \nabla_x \times A .
\end{align*}

Given that $L$ is defined in terms of a quadratic polynomial in the fields, we can easily
deduce the equations of motion from the action functional

\[ S(A,\phi) := \int_0^T dt \, L\big( t , A(t) , \phi(t) , \partial_t A(t) , \partial_t \phi(t) \big) \tag{10.1.9} \]

where $(A,\phi)$ is a path in the space of potentials. Physicists usually use $\delta$ to denote (what
they call) “functional differentiation”, and the Euler-Lagrange equations (10.1.7) are
expressed as

\begin{align}
\frac{d}{dt} \frac{\delta L}{\delta \dot{\phi}} - \frac{\delta L}{\delta \phi} &= 0 , \tag{10.1.10a} \\
\frac{d}{dt} \frac{\delta L}{\delta \dot{A}} - \frac{\delta L}{\delta A} &= 0 . \tag{10.1.10b}
\end{align}

These can be computed by pretending that the integrand is just a polynomial in $A$, $\partial_t A$,
$\phi$ and $\partial_t \phi$. We postpone a proper derivation until after the discussion of these two
equations.

The Lagrange density $\mathcal{L}$ is independent of $\dot{\phi} = \partial_t \phi$ so that partial integration yields
Gauß’s law (charges are the sources of electric fields),

\[ \nabla_x \cdot \big( -\partial_t A - \nabla_x \phi \big) = \nabla_x \cdot E = \rho . \tag{10.1.11} \]

This equation acts as a constraint and is not a dynamical equation of motion. Note that
since $B = \nabla_x \times A$ is the curl of a vector field, it is automatically divergence-free, $\nabla_x \cdot B = 0$.
This takes care of the two constraints, equations (5.5.2) for $\varepsilon = 1 = \mu$.

Equation (10.1.10b) yields both of the dynamical Maxwell equations, starting with

\[ -\partial_t^2 A - \nabla_x \partial_t \phi = \partial_t E = \nabla_x \times \nabla_x \times A - j = \nabla_x \times B - j . \tag{10.1.12} \]

To obtain the other dynamical Maxwell equation, we differentiate $B = \nabla_x \times A$ with respect to
time and use $\nabla_x \times \nabla_x \phi = 0$:

\begin{align*}
\partial_t B &= \nabla_x \times \partial_t A = -\nabla_x \times \big( -\partial_t A - \nabla_x \phi \big) \\
&= -\nabla_x \times E
\end{align*}

Hence, we obtain the usual Maxwell equations after introducing a pair of new variables,
the electric field $E$ and the magnetic field $B$. These fields are independent of the choice of
gauge, but more on that below.

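Solutions to the Maxwell equations are stationary points of the action functional (10.1.9),
and a proper derivation involves computing the functional derivative:

\begin{align*}
\big( dS(A,\phi) \big)(a,\varphi) &= \frac{d}{ds} S(A + s a , \phi + s \varphi) \Big|_{s=0} \\
&= \int_0^T dt \int_{\mathbb{R}^3} dx \, \frac{d}{ds} \Big( \tfrac{1}{2} \big| -\partial_t A - \nabla_x \phi - s \, \partial_t a - s \, \nabla_x \varphi \big|^2 - \tfrac{1}{2} \big| \nabla_x \times A + s \, \nabla_x \times a \big|^2 + \\
&\qquad \qquad + j \cdot A - \rho \, \phi + s \, j \cdot a - s \, \rho \, \varphi \Big) \Big|_{s=0} \\
&= \int_0^T dt \int_{\mathbb{R}^3} dx \, \Big( \big( -\partial_t A - \nabla_x \phi \big) \cdot \big( -\partial_t a - \nabla_x \varphi \big) - \big( \nabla_x \times A \big) \cdot \big( \nabla_x \times a \big) + j \cdot a - \rho \, \varphi \Big) \\
&= \int_0^T dt \int_{\mathbb{R}^3} dx \, \Big( \big( -\partial_t^2 A - \nabla_x \partial_t \phi - \nabla_x \times \nabla_x \times A + j \big) \cdot a + \big( \nabla_x \cdot ( -\partial_t A - \nabla_x \phi ) - \rho \big) \, \varphi \Big)
\end{align*}

Setting $dS(A,\phi) = 0$ yields equations (10.1.12) and (10.1.11).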


Eliminating the constraints The presence of the constraint equations

\begin{align*}
\nabla_x \cdot E &= \rho \\
\nabla_x \cdot B &= 0
\end{align*}

means we can in fact eliminate some variables. The idea is to decompose the electromagnetic
fields

\begin{align*}
E &= E_{\parallel} + E_{\perp} , \\
B &= B_{\parallel} + B_{\perp}
\end{align*}

into longitudinal ($\parallel$) and transversal ($\perp$) components. Transversal fields are those for
which $\nabla_x \cdot E_{\perp} = 0$ holds while the longitudinal component is simply the remainder
$E_{\parallel} = E - E_{\perp}$, which can always be written as a gradient of some function; on $\mathbb{R}^3$ this
Helmholtz decomposition of fields is unique.
Theorem 10.1.2 (Helmholtz-Hodge-Weyl-Leray decomposition) The Hilbert space

\[ L^2(\mathbb{R}^3,\mathbb{C}^3) = J \oplus G \]

decomposes into the orthogonal subspaces

\[ J := \ker \mathrm{div} = \mathrm{ran} \, \mathrm{curl} \tag{10.1.13} \]

and

\[ G := \mathrm{ran} \, \mathrm{grad} = \ker \mathrm{curl} . \tag{10.1.14} \]

In other words, any vector field $E = E_{\parallel} + E_{\perp}$ can be uniquely decomposed into $E_{\parallel} \in G$ and
$E_{\perp} \in J$.

Proof (Sketch) We will disregard most technical questions and content ourselves with showing
orthogonality of the subspaces and $J \cap G = \{ 0 \}$. Note that since $C_c^\infty(\mathbb{R}^3,\mathbb{C}^3)$ is dense
in $L^2(\mathbb{R}^3,\mathbb{C}^3)$ it suffices to work with vectors from that dense subspace. The proof of
equation (10.1.14) can be found in [Tem01, Chapter I, Theorem 1.4, equation (1.34)];
equation (10.1.13) is shown in [Tem01, Chapter I, Theorem 1.4, equation (1.33) and
Remark 1.5] and [Pic98, Theorem 1.1].

Let us start with orthogonality: Pick $\phi = \nabla_x \varphi \in G \cap C_c^\infty(\mathbb{R}^3,\mathbb{C}^3)$ where $\varphi \in C_c^\infty(\mathbb{R}^3)$
and $\psi \in J = \ker \mathrm{div}$. Then partial integration yields

\[ \langle \psi , \phi \rangle = \langle \psi , \nabla_x \varphi \rangle = - \langle \nabla_x \cdot \psi , \varphi \rangle = 0 , \]

meaning that the vectors are necessarily orthogonal.

It is crucial here that the space of harmonic vector fields

\[ \mathrm{Har}(\mathbb{R}^3,\mathbb{R}^3) := \ker \mathrm{div} \cap \ker \mathrm{curl} = J \cap G = \{ 0 \} \]

is trivial, because by

\[ \nabla_x \times \nabla_x \times \phi = \nabla_x ( \nabla_x \cdot \phi ) - \Delta_x \phi \]

the subvector space consists of functions which component-wise satisfy $\Delta_x \phi_j = 0$. But on
$C_c^\infty(\mathbb{R}^3,\mathbb{R}^3)$ and $L^2(\mathbb{R}^3,\mathbb{R}^3)$, this equation has no non-trivial solution. □

On bounded subsets of $\mathbb{R}^3$ there are harmonic vector fields because then at least the
constant function is square integrable. Then the Helmholtz splitting is more subtle.
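One common way to make the splitting concrete is to look at a single Fourier mode, where the divergence acts as $i k \cdot$ and the curl as $i k \times$. The sketch below is an illustration and not taken from the notes: for one wave vector `k` and one real amplitude `e` (both sample values are assumptions), the longitudinal part is the projection onto `k` and the transversal part is the remainder. The function name `helmholtz_split` is hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def helmholtz_split(k, e):
    """Split the amplitude e of a Fourier mode with wave vector k != 0
    into a part parallel to k (longitudinal) and the remainder (transversal)."""
    scale = dot(k, e) / dot(k, k)
    e_par = tuple(scale * kj for kj in k)
    e_perp = tuple(ej - pj for ej, pj in zip(e, e_par))
    return e_par, e_perp

k = (1.0, 2.0, -2.0)     # sample wave vector
e = (3.0, -1.0, 4.0)     # sample field amplitude
e_par, e_perp = helmholtz_split(k, e)

div_perp = dot(k, e_perp)        # k . E_perp, should vanish (divergence-free)
curl_par = cross(k, e_par)       # k x E_par, should vanish (curl-free)
overlap = dot(e_par, e_perp)     # orthogonality of the two parts
print(div_perp, curl_par, overlap)
```

Mode by mode this reproduces the three properties used in the theorem: the transversal part is divergence-free, the longitudinal part is curl-free, and the two parts are orthogonal.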

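Clearly, by definition $B_{\parallel} = 0$ and $\nabla_x \cdot E = \nabla_x \cdot E_{\parallel} = \rho$. Moreover, $\nabla_x \times E = \nabla_x \times E_{\perp}$
holds so that we obtain

\begin{align*}
B_{\parallel}(t) &= 0 , \\
\nabla_x \cdot E_{\parallel}(t) &= \rho(t) .
\end{align*}

Note that the last equation can be solved with the help of the Fourier transform.

The two dynamical contributions now only involve the transversal components of the
fields,

\begin{align*}
\partial_t E_{\perp} &= \nabla_x \times B_{\perp} - j , \\
\partial_t B_{\perp} &= -\nabla_x \times E_{\perp} .
\end{align*}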

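Emergence of E and B One may wonder what motivates one to set $E = -\partial_t A - \nabla_x \phi$ and
$B = \nabla_x \times A$. Let us re-examine the Euler-Lagrange equation for the second variable,

\[ -\partial_t^2 A - \nabla_x \partial_t \phi = \nabla_x \times \nabla_x \times A - j . \]

The left-hand side involves second-order time-derivatives, and if we want to write it as a
first-order equation, we can introduce the new variable $E = -\partial_t A - \nabla_x \phi$ to obtain

\[ \frac{d}{dt} \begin{pmatrix} A \\ E \end{pmatrix} = \begin{pmatrix} -E - \nabla_x \phi \\ \nabla_x \times \nabla_x \times A - j \end{pmatrix} . \]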

Gauge symmetry The constraint equation (10.1.11) is related to a continuous symmetry
and leads to a conserved quantity. The relation between the two, a continuous symmetry
and a conserved quantity, is made precise by Noether’s theorem (cf. [MR99, Theorem 11.4.1]).
Here, the gauge symmetry of the action functional leads to (10.1.11): if
$\chi : \mathbb{R} \times \mathbb{R}^3 \longrightarrow \mathbb{R}$ is a smooth function depending on time and space, then a quick
computation shows

\[ S(A,\phi) = S\big( A + \nabla_x \chi , \phi - \partial_t \chi \big) \]

holds because the extra terms in the first two terms cancel exactly while those in the last
two cancel because of charge conservation (10.1.8).


10.1.5 Constraints 

Two of the four Maxwell equations describe constraints, and we have seen how to factor
out the constraints in the dynamical equations by splitting $E$ and $B$ into longitudinal
and transversal components. This represents one way to deal with constraints: one
introduces a suitable parametrization (coordinates) to factor them out. Unfortunately, this is
often non-obvious and practically impossible. But fortunately, one can apply a well-known
technique if one wants to find extrema for functions on subsets of $\mathbb{R}^d$ under constraints,
Lagrange multipliers.
For simplicity, let us reconsider the case of functions on $\mathbb{R}^d$. There, the idea of Lagrange
multipliers is very simple: assume one wants to find the extrema of $f : \mathbb{R}^d \longrightarrow \mathbb{R}$ under
the constraints

\[ g(x) = \big( g_1(x) , \ldots , g_n(x) \big) = 0 \in \mathbb{R}^n . \]

Then this problem is equivalent to finding the (ordinary) extrema of the function

\[ F : \mathbb{R}^d \times \mathbb{R}^n \longrightarrow \mathbb{R} , \quad F(x,\lambda) := f(x) + \lambda \cdot g(x) \]

and the condition for the extrema,

\begin{align*}
\nabla_x f(x) + \lambda \cdot \nabla_x g(x) &= 0 , \\
g(x) &= 0 ,
\end{align*}

shows how the recipe of using Lagrange multipliers works behind the scenes. There is
also a simple visualization in case $n = 1$: setting $\nabla_x f(x) + \lambda \nabla_x g(x) = 0$ means $\nabla_x f(x)$
and $\nabla_x g(x)$ are parallel to one another. Assume, for instance, that $x_0$ is a local maximum,
but that $\nabla_x f(x_0)$ and $\nabla_x g(x_0)$ are not parallel. First off, $\nabla_x g(x)$ is always normal to the
surface defined by $g(x) = 0$, and we can split $\nabla_x f(x_0) = v_{\parallel} + v_{\perp}$ where $v_{\parallel} \parallel \nabla_x g(x_0)$ and
$v_{\perp}$ is orthogonal. Then $v_{\perp} \neq 0$ means we can increase the value of $f(x)$ along $g(x) = 0$
by going in the direction of $v_{\perp}$, because

\[ \nabla_x f(x_0) \cdot v_{\perp} = v_{\perp}^2 > 0 . \]

The argument for a local minimum is analogous, one simply has to walk in the direction
opposite of $v_{\perp}$ to lower $f(x)$. Put another way, tangential to the surface, all components
of $\nabla_x f(x)$ need to vanish, but along the surface normal $\nabla_x f(x)$ need not be zero. But it is
prohibited to travel along this direction, because one would leave the surface $\{ g(x) = 0 \}$.
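The parallelism of the two gradients can be checked on a toy problem. The example below is an assumption made for illustration and does not appear in the notes: we maximize $f(x,y) = x + y$ on the unit circle $g(x,y) = x^2 + y^2 - 1 = 0$ by brute-force scanning of the constraint set, and then verify that $\nabla f$ and $\nabla g$ are parallel at the maximizer.

```python
import math

def f(x, y):
    return x + y

def grad_f(x, y):
    return (1.0, 1.0)

def grad_g(x, y):
    # g(x, y) = x^2 + y^2 - 1
    return (2 * x, 2 * y)

# Scan the constraint set g = 0 (the unit circle) for the maximum of f
n = 3600
best = max(range(n),
           key=lambda k: f(math.cos(2 * math.pi * k / n),
                           math.sin(2 * math.pi * k / n)))
theta = 2 * math.pi * best / n
x0, y0 = math.cos(theta), math.sin(theta)

gf, gg = grad_f(x0, y0), grad_g(x0, y0)

# In two dimensions, two vectors are parallel iff this determinant vanishes
parallel_defect = gf[0] * gg[1] - gf[1] * gg[0]

# Multiplier from projecting grad f onto grad g, so that grad f + lam * grad g = 0
lam = -(gf[0] * gg[0] + gf[1] * gg[1]) / (gg[0] ** 2 + gg[1] ** 2)
residual = (gf[0] + lam * gg[0], gf[1] + lam * gg[1])
print((x0, y0), lam, residual)
```

The scan locates the maximizer at the angle $\pi/4$, where the determinant and the residual $\nabla f + \lambda \nabla g$ vanish up to rounding, with $\lambda = -1/\sqrt{2}$.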

To go back to the realm of functionals, we merely need to translate all these ideas
properly: consider a functional $\mathcal{E} : \Omega \subseteq \mathcal{X} \longrightarrow \mathbb{R}$ restricted to

\[ U := \big\{ x \in \Omega \;\big|\; \mathcal{J}(x) = 0 \big\} \]

described by the constraint functional $\mathcal{J}$.

Theorem 10.1.3 Let $\mathcal{E}$ and $\mathcal{J}$ both be functionals and assume $x_0 \in U$ is a critical point
of $\mathcal{E}|_U$. Then there exists $\lambda \in \mathbb{R}$ such that

\[ d\mathcal{E}(x_0) + \lambda \, d\mathcal{J}(x_0) = 0 . \]

Proof Let $x_s$ be a differentiable path in $U$ for the tangent vector $\xi \in T_{x_0} U$, i. e. it is
a differentiable path in $\Omega$ such that $\mathcal{J}(x_s) = 0$. Then $\mathcal{J}(x_s) = 0$ implies automatically
$d\mathcal{J}(x_0) \xi = 0$ for all such paths, i. e. $\xi \in \ker d\mathcal{J}(x_0)$ (“$d\mathcal{J}(x)$ is normal to $\mathcal{J}(x) = 0$”).
The fact that $x_0$ is a critical point of $\mathcal{E}$ in $U$ means

\[ d\mathcal{E}(x_0) \xi = \frac{d}{ds} \mathcal{E}(x_s) \Big|_{s=0} = 0 , \]

meaning $\xi \in \ker d\mathcal{J}(x_0)$ at critical points implies $\xi \in \ker d\mathcal{E}(x_0)$ (“$d\mathcal{E}(x_0)$ and $d\mathcal{J}(x_0)$ are
parallel”). □

The proof once again states that in the tangential direction to $U$, one needs to have
$d\mathcal{E}(x_0) = 0$ while in the normal direction, $d\mathcal{E}(x_0) \neq 0$ is permissible because it is not
possible to travel in this direction as one would leave the “surface” $U = \{ \mathcal{J}(x) = 0 \}$.


Example Assume we would like to find the equations of motion of a particle of mass 1
moving on the surface of the sphere $S^2 \subset \mathbb{R}^3$ with radius 1. Then the Lagrange function
describing this motion is just the free Lagrange function in $\mathbb{R}^3$, namely $L(x,v) = \frac{1}{2} v^2$. The
constraint functional is described in terms of the function $J(x,v) = \frac{1}{4} (x^2 - 1)^2$,

\[ \mathcal{J}(q) := \int_0^T dt \, J\big( q(t) , \dot{q}(t) \big) . \]

Paths on the sphere satisfy $\mathcal{J}(q) = 0$. Equation (10.1.6) now yields a very efficient way to
compute $dS + \lambda \, d\mathcal{J}$, namely for any path $q$ on the surface of the sphere, we obtain

\begin{align*}
\big( dS + \lambda \, d\mathcal{J} \big) h &= \int_0^T dt \, \Big[ \nabla_x (L + \lambda J)\big( q(t),\dot{q}(t) \big) - \frac{d}{dt} \nabla_v (L + \lambda J)\big( q(t),\dot{q}(t) \big) \Big] \cdot h(t) \\
&= \int_0^T dt \, \Big( -\ddot{q}(t) + \lambda \big( q(t)^2 - 1 \big) \, q(t) \Big) \cdot h(t) .
\end{align*}

By construction, $h(t)$ is tangent to the sphere at $q(t)$, i. e. $h(t)$ needs to be perpendicular
to $q(t)$, and thus the second term vanishes (the term also vanishes because $q(t)^2 = 1$,
but the other argument still holds true if we change the constraint function). In fact, $q(t)$
is always normal to the tangent plane, and thus saying that $\ddot{q}(t)$ is perpendicular to the
plane really means

\[ \ddot{q}(t) = \lambda(t) \, q(t) . \tag{10.1.15} \]

Writing this equation as a first-order equation, we introduce the variable $v(t) := \dot{q}(t)$
which by definition is tangent to $S^2$ at $q(t)$. Because we are in three dimensions, there
exists a unique vector $\omega(t)$ with $\omega(t) \cdot q(t) = 0$ ($\omega(t)$ must lie in the tangent plane),
$\omega(t) \cdot \dot{q}(t) = 0$ and

\[ \dot{q}(t) = \omega(t) \times q(t) , \]

because $\omega(t)$ is the vector which completes $\{ q(t) , \dot{q}(t) \}$ to an orthogonal basis of $\mathbb{R}^3$ and
is proportional to $q(t) \times \dot{q}(t)$. In fact, the above equation implies $\omega(t) = q(t) \times \dot{q}(t)$ since

\[ q(t) \times \dot{q}(t) = q(t) \times \big( \omega(t) \times q(t) \big) = \underbrace{|q(t)|^2}_{=1} \, \omega(t) - \underbrace{q(t) \cdot \omega(t)}_{=0} \, q(t) = \omega(t) . \]

Differentiating the left-hand side with respect to time yields an equation of motion for $\omega$,

\[ \frac{d}{dt} \omega(t) = \frac{d}{dt} \big( q(t) \times \dot{q}(t) \big) = \dot{q}(t) \times \dot{q}(t) + q(t) \times \ddot{q}(t) = q(t) \times \ddot{q}(t) , \]

and with the help of equation (10.1.15), we deduce $\frac{d}{dt} \omega = 0$. That means $\omega(t) = \omega(0) =
q(0) \times \dot{q}(0)$ is constant in time and the equations of motion reduce to

\[ \dot{q}(t) = \omega \times q(t) . \]

We have solved these equations numerous times in the course; the solutions are rotations
around the axis along $\omega$ with angular velocity $|\omega|$, just as expected.
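The reduced equation of motion $\dot{q} = \omega \times q$ is easy to integrate numerically. The following sketch is an illustration with made-up initial data (none of the numbers come from the notes): it computes $\omega = q(0) \times \dot{q}(0)$, integrates with a classical fourth-order Runge-Kutta step (`rk4_step` is a hypothetical helper name) over one expected period $2\pi / |\omega|$, and checks that the path stays on the sphere and returns to its starting point.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    return math.sqrt(sum(x * x for x in a))

def rk4_step(q, omega, dt):
    """One Runge-Kutta step for qdot = omega x q with constant omega."""
    def rhs(p):
        return cross(omega, p)
    k1 = rhs(q)
    k2 = rhs(tuple(qi + 0.5 * dt * ki for qi, ki in zip(q, k1)))
    k3 = rhs(tuple(qi + 0.5 * dt * ki for qi, ki in zip(q, k2)))
    k4 = rhs(tuple(qi + dt * ki for qi, ki in zip(q, k3)))
    return tuple(qi + dt / 6 * (a + 2 * b + 2 * c + d)
                 for qi, a, b, c, d in zip(q, k1, k2, k3, k4))

# Sample initial data: a point on the unit sphere and a tangent velocity
q0 = (1.0, 0.0, 0.0)
v0 = (0.0, 0.6, 0.8)
omega = cross(q0, v0)                 # conserved vector q x qdot

period = 2 * math.pi / norm(omega)    # expected rotation period
steps = 2000
dt = period / steps

q = q0
max_radius_error = 0.0
for _ in range(steps):
    q = rk4_step(q, omega, dt)
    max_radius_error = max(max_radius_error, abs(norm(q) - 1.0))

return_error = norm(tuple(a - b for a, b in zip(q, q0)))
print(max_radius_error, return_error)
```

Both errors are at the level of the integration error, which is consistent with a rotation around the axis along $\omega$ with angular velocity $|\omega|$.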

10.1.6 Complex derivatives 

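To frame the discussion, we will quickly recap why complex derivatives of functions are
fundamentally different from real derivatives. In what follows let us denote complex
numbers with $z = z_{\mathrm{Re}} + i z_{\mathrm{Im}} \in \mathbb{C}$ where $z_{\mathrm{Re}} , z_{\mathrm{Im}} \in \mathbb{R}$ are real and imaginary part. We also
split any function $f : \mathbb{C} \longrightarrow \mathbb{C}$ into real and imaginary part as $f = f_{\mathrm{Re}} + i f_{\mathrm{Im}}$ where now
$f_{\mathrm{Re}} , f_{\mathrm{Im}} : \mathbb{R}^2 \longrightarrow \mathbb{R}$. The identification of $\mathbb{C} \cong \mathbb{R}^2$ via $z \mapsto \vec{z} := (z_{\mathrm{Re}} , z_{\mathrm{Im}})$ allows us to think
of

\[ \partial_z f(z_0) = \lim_{z \to z_0} \frac{f(z) - f(z_0)}{z - z_0} \tag{10.1.16} \]

as a limit in $\mathbb{R}^2$. As we are in more than one (real) dimension, the above equation
implicitly assumes that the limit is independent of the path taken as $z \to z_0$. To simplify the
notation a little, we will assume without loss of generality that $z_0 = 0$ and $f(0) = 0$. Then
the existence of the limit (10.1.16) implies that

\begin{align*}
\partial_z f(0) &= \lim_{z_{\mathrm{Re}} \to 0} \frac{f_{\mathrm{Re}}(z_{\mathrm{Re}}) + i f_{\mathrm{Im}}(z_{\mathrm{Re}})}{z_{\mathrm{Re}}} = \partial_{z_{\mathrm{Re}}} f(0) \\
&= \lim_{z_{\mathrm{Im}} \to 0} \frac{f_{\mathrm{Re}}(i z_{\mathrm{Im}}) + i f_{\mathrm{Im}}(i z_{\mathrm{Im}})}{i z_{\mathrm{Im}}} = -i \, \partial_{z_{\mathrm{Im}}} f(0)
\end{align*}

are in fact equal. Thus, equating real and imaginary part, we immediately get the Cauchy-Riemann
equations,

\begin{align}
\partial_{z_{\mathrm{Re}}} f_{\mathrm{Re}}(0) &= + \partial_{z_{\mathrm{Im}}} f_{\mathrm{Im}}(0) , \tag{10.1.17a} \\
\partial_{z_{\mathrm{Im}}} f_{\mathrm{Re}}(0) &= - \partial_{z_{\mathrm{Re}}} f_{\mathrm{Im}}(0) . \tag{10.1.17b}
\end{align}

This reasoning shows that complex differentiability of $f$ (i. e. the limit in (10.1.16) exists)
implies (10.1.17). In fact, these two are equivalent: a function is complex differentiable
or holomorphic if and only if

\[ \partial_{\bar{z}} f(z) = \tfrac{1}{2} \big( \partial_{z_{\mathrm{Re}}} + i \, \partial_{z_{\mathrm{Im}}} \big) f(z) = 0 . \]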
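The criterion $\partial_{\bar{z}} f = 0$ can be tested numerically with difference quotients along the real and imaginary directions. This is only a sketch: the helper name `dbar`, the step size and the two sample functions are assumptions chosen for illustration.

```python
def dbar(f, z, h=1e-6):
    """Approximate (1/2) (d/dx + i d/dy) f at z = x + i y by central differences."""
    d_re = (f(z + h) - f(z - h)) / (2 * h)            # derivative along the real axis
    d_im = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)  # derivative along the imaginary axis
    return 0.5 * (d_re + 1j * d_im)

z0 = 0.3 + 0.7j

# A polynomial in z is holomorphic, so its dbar-derivative should vanish
holomorphic = dbar(lambda z: z ** 3 + 2 * z, z0)

# Complex conjugation is not holomorphic; its dbar-derivative equals 1
antiholomorphic = dbar(lambda z: z.conjugate(), z0)

print(abs(holomorphic), antiholomorphic)
```

The polynomial satisfies the Cauchy-Riemann equations, so the first value is zero up to discretization and rounding errors, whereas complex conjugation violates them.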

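So let us turn our attention back to functionals. The idea here is the same: we can identify
each complex Banach space $\mathcal{X} = \mathcal{X}_{\mathbb{R}} \oplus i \mathcal{X}_{\mathbb{R}}$ as a Banach space over $\mathbb{R}$ whose dimension is
twice as large via the identification $x \mapsto \vec{x} = (\mathrm{Re} \, x , \mathrm{Im} \, x) = (x_{\mathrm{Re}} , x_{\mathrm{Im}})$. Hence, we can
associate to any functional $\mathcal{E} : \Omega \subseteq \mathcal{X} \longrightarrow \mathbb{C}$ another functional on $\mathcal{X}_{\mathbb{R}} \oplus i \mathcal{X}_{\mathbb{R}}$ via

\[ \mathcal{E}_{\mathbb{R}}(\vec{x}) := \mathcal{E}(x_{\mathrm{Re}} + i x_{\mathrm{Im}}) . \]

For this functional, we can take (partial) derivatives with respect to $x_{\mathrm{Re}}$ and $x_{\mathrm{Im}}$ as before
(we are in the setting of functionals over a real Banach space again) as well as define the
gradient $d\mathcal{E}_{\mathbb{R}}$. Then analogously to derivatives on $\mathbb{C}$, we define the complex partial derivatives

\begin{align}
\partial_x \mathcal{E}(x) &:= \partial_{x_{\mathrm{Re}}} \mathcal{E}(x) - i \, \partial_{x_{\mathrm{Im}}} \mathcal{E}(x) , \tag{10.1.18a} \\
\partial_{\bar{x}} \mathcal{E}(x) &:= \partial_{x_{\mathrm{Re}}} \mathcal{E}(x) + i \, \partial_{x_{\mathrm{Im}}} \mathcal{E}(x) , \tag{10.1.18b}
\end{align}

and the complex Gateaux derivatives

\begin{align}
d\mathcal{E}(x) \equiv d_x \mathcal{E}(x) &:= d_{x_{\mathrm{Re}}} \mathcal{E}(x) - i \, d_{x_{\mathrm{Im}}} \mathcal{E}(x) , \tag{10.1.19a} \\
\bar{d}\mathcal{E}(x) \equiv d_{\bar{x}} \mathcal{E}(x) &:= d_{x_{\mathrm{Re}}} \mathcal{E}(x) + i \, d_{x_{\mathrm{Im}}} \mathcal{E}(x) . \tag{10.1.19b}
\end{align}

Now that derivatives have been extended, let us define the notion of critical point.

Definition 10.1.4 (Critical point) Let $\mathcal{E} : \Omega \subseteq \mathcal{X} \longrightarrow \mathbb{C}$ be a functional. Then $x_0 \in \Omega$ is
a critical point if and only if $\bar{d}\mathcal{E}(x_0) = 0$.

One way to compute $\bar{d}\mathcal{E}(x)$ is to treat $x$ and $\bar{x}$ as independent functions and compute the
corresponding partial Gateaux derivative. Note that in case of polynomials.

The reason why we have chosen to take the derivative $\bar{d}$ instead of $d$ is just a matter of
convention and convenience: in the context of Hilbert spaces, we can write

\[ \bar{d}\mathcal{E}(\psi) \varphi = \big\langle \varphi , \mathcal{E}'(\psi) \big\rangle \]

as a scalar product; the vector $\mathcal{E}'(\psi)$ is defined via the Riesz representation theorem 4.3.5.
If we had used $d$ instead, then $\mathcal{E}'(\psi)$ would appear in the first, the anti-linear argument
of the scalar product.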
Derivation of the Ginzburg-Landau equations As a simple example, let us derive the
Euler-Lagrange equation for the Ginzburg-Landau energy functional,

\[ \mathcal{E}_\Omega(\psi,A) := \int_\Omega dx \, \Big( \big| (-i \nabla_x - A(x)) \psi(x) \big|^2 + \frac{\kappa^2}{2} \big( |\psi(x)|^2 - 1 \big)^2 + \big( \nabla_x \times A(x) \big)^2 \Big) . \tag{10.1.20} \]

It describes the difference in Helmholtz free energy between the ordinary and superconducting
phase of an ordinary superconductor in the bounded region $\Omega \subset \mathbb{R}^3$ which is
subjected to a magnetic field $B = \nabla_x \times A$. Here, $\psi$ is an order parameter which describes
whether the material is an ordinary conductor ($\psi = 0$) or the electrons have formed
Cooper pairs which carry a superconducting current ($\psi \neq 0$).

Admissible states are by definition critical points of the Ginzburg-Landau functional, i. e. they satisfy

\begin{align}
\frac{d}{ds} \mathcal{E}_\Omega(\psi + s \varphi , A + s a) \Big|_{s=0} = 2 \, \mathrm{Re} \int_\Omega dx \, \Big( &\big[ \nabla_x \times \nabla_x \times A - \overline{\psi} \, (-i \nabla_x - A) \psi \big] \cdot a + \notag \\
&+ \overline{\varphi} \, \big( (-i \nabla_x - A)^2 \psi - \kappa^2 ( 1 - |\psi|^2 ) \psi \big) \Big) = 0 , \tag{10.1.21}
\end{align}

which leads to the Ginzburg-Landau equations,

\begin{align}
\nabla_x \times \nabla_x \times A &= \mathrm{Re} \big( \overline{\psi} \, (-i \nabla_x - A) \psi \big) =: j(\psi,A) , \tag{10.1.22a} \\
0 &= (-i \nabla_x - A)^2 \psi - \kappa^2 \big( 1 - |\psi|^2 \big) \psi . \tag{10.1.22b}
\end{align}

We could have also obtained these equations by differentiating the integrand with respect to $\overline{\psi}$.

Clearly, for all $A$ which describe constant magnetic fields $B = \nabla_x \times A$ the normal conducting
phase $\psi = 0$ as well as the perfect superconductor $\psi = 1$ are solutions, and the
question is for which values of $\kappa$ and $B$ there are other solutions (mixed phase where
normal and superconducting states coexist). We will continue this example later on in the
chapter.


10.1.7 Second derivatives 


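When we introduced the Gateaux derivative for functionals, we claimed that one can
compute second- and higher-order derivatives in the same fashion. We choose to illustrate
the general principle with the Ginzburg-Landau equations: the Euler-Lagrange equations
can either be understood as the integral expression (10.1.21) or the gradient

\[ \mathcal{E}_\Omega'(\psi,A) = \begin{pmatrix} (-i \nabla_x - A)^2 \psi - \kappa^2 ( 1 - |\psi|^2 ) \psi \\ \nabla_x \times \nabla_x \times A - \mathrm{Re} \big( \overline{\psi} \, (-i \nabla_x - A) \psi \big) \end{pmatrix} \tag{10.1.23} \]

defined by $\big\langle (\varphi,a) , \mathcal{E}_\Omega'(\psi,A) \big\rangle := \big( d\mathcal{E}_\Omega(\psi,A) \big)(\varphi,a)$. The Hessian $\mathcal{E}_\Omega''$ of $\mathcal{E}_\Omega$ now is just the
linearization of $\mathcal{E}_\Omega'(\psi,A)$,

\[ \big( \mathcal{E}_\Omega''(\psi,A) \big)(\varphi,a) := \frac{d}{ds} \mathcal{E}_\Omega'(\psi + s \varphi , A + s a) \Big|_{s=0} . \tag{10.1.24} \]

Equivalently, we could have used the quadratic form

\[ \Big\langle (\eta,\alpha) , \mathcal{E}_\Omega''(\psi,A) (\varphi,a) \Big\rangle := \frac{\partial^2}{\partial s \, \partial t} \mathcal{E}_\Omega\big( \psi + s \varphi + t \eta , A + s a + t \alpha \big) \Big|_{s=t=0} \]

as a definition. A somewhat lengthier computation yields an explicit expression,

\[ \big( \mathcal{E}_\Omega''(\psi,A) \big)(\varphi,a) = \begin{pmatrix} (-i \nabla_x - A)^2 \varphi + \kappa^2 \big( ( 2 |\psi|^2 - 1 ) \varphi + \psi^2 \overline{\varphi} \big) - 2 \, (-i \nabla_x - A) \psi \cdot a + i \psi \, \nabla_x \cdot a \\ \big( \nabla_x \times \nabla_x \times + |\psi|^2 \big) a - \mathrm{Re} \big( \overline{\varphi} \, (-i \nabla_x - A) \psi + \overline{\psi} \, (-i \nabla_x - A) \varphi \big) \end{pmatrix} , \]

and the take-away message here is that $\mathcal{E}_\Omega''(\psi,A)$ is an $\mathbb{R}$-linear map, and thus, we can use
the tools of functional analysis to probe the properties of the Hessian.

Quite generally, the Hessian of a functional $\mathcal{E}$ is just the second-order Gateaux derivative,

\[ \big\langle \beta , \mathcal{E}''(\psi) \alpha \big\rangle := \frac{\partial^2}{\partial s \, \partial t} \mathcal{E}(\psi + s \alpha + t \beta) \Big|_{s=t=0} , \]

which implicitly defines a linear partial differential operator. The properties of the Hessian
characterize the behavior of the fixed point: is it a local maximum, a local minimum or a
saddle point?

In short, we have the following hierarchy: We would like to find the critical points of a
given functional $\mathcal{E}$ and characterize them. The critical point equation is then equivalent
to a non-linear PDE whose solutions are the fixed points. The linearization of this PDE at a
critical point yields a linear PDE which serves as the starting point for the stability analysis
of that fixed point. This is exactly what we have done for ODEs in Chapter 2.3: the non-linear
vector field determines the fixed points while its linearization can be used to classify
the fixed point. Hence, functionals generate many interesting linear and non-linear PDEs.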
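The role of the Hessian is easiest to see in the finite-dimensional analogue. The sketch below is an illustration, not part of the notes: for functions of two real variables it approximates the Hessian at a critical point by finite differences and classifies the point through the signs of the two eigenvalues. The helper names `hessian_2d` and `classify`, the step size, the tolerance and the sample functions are all assumptions.

```python
import math

def hessian_2d(f, x, y, h=1e-3):
    """Finite-difference approximation of the second derivatives of f at (x, y)."""
    fxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h ** 2
    fyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h ** 2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h ** 2)
    return fxx, fxy, fyy

def classify(fxx, fxy, fyy, tol=1e-6):
    """Classify a critical point via the eigenvalues of the symmetric 2x2 Hessian."""
    mean = 0.5 * (fxx + fyy)
    radius = math.sqrt((0.5 * (fxx - fyy)) ** 2 + fxy ** 2)
    low, high = mean - radius, mean + radius
    if low > tol:
        return "local minimum"
    if high < -tol:
        return "local maximum"
    if low < -tol and high > tol:
        return "saddle point"
    return "degenerate"

# Sample functions, each with a critical point at the origin
kind_min = classify(*hessian_2d(lambda x, y: x * x + x * y + 2 * y * y, 0.0, 0.0))
kind_max = classify(*hessian_2d(lambda x, y: -x * x + x * y - 3 * y * y, 0.0, 0.0))
kind_saddle = classify(*hessian_2d(lambda x, y: math.cos(x) + y * y, 0.0, 0.0))
print(kind_min, kind_max, kind_saddle)
```

For functionals the Hessian is an operator rather than a matrix, so the signs of eigenvalues are replaced by spectral properties of a linear PDE operator, but the logic of the classification is the same.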


10.2 Key points for a rigorous analysis 

Up to now, we have not been very rigorous. A careful reader will have noticed that
even though we have defined a notion of differentiability, we have not defined continuity
which in real analysis precedes differentiability. That was a conscious choice, because the
question of continuity for functionals turns out to be much more involved. But continuity
is not necessary to define the Gateaux derivative.

So let us sketch what is involved in a rigorous analysis of functionals. The purpose here
is not to completely cover and overcome all of the difficulties, but to merely point them
out. So let us consider a functional $\mathcal{E} : \Omega \subseteq \mathcal{X} \longrightarrow \mathbb{R}$, and assume we would like to find a
minimizer, i. e. an element $x_0 \in \Omega$ for which

\[ \mathcal{E}(x_0) = \inf_{x \in \Omega} \mathcal{E}(x) , \]

and whether this minimizer is unique. Clearly, we have to assume that the functional is
bounded from below,

\[ E_0 := \inf_{x \in \Omega} \mathcal{E}(x) > -\infty , \]

otherwise such a minimizer cannot exist.

Now let us consider the case where $\Omega$ is a closed subset of $\mathbb{R}^d$, i. e. we are looking at
functions (in the ordinary sense) from $\mathbb{R}^d$ to $\mathbb{R}$. We split this minimization procedure into 3
steps:

(1) Pick a minimizing sequence $\{ x_n \}_{n \in \mathbb{N}}$ for which $\lim_{n \to \infty} \mathcal{E}(x_n) = E_0$. Such a sequence
exists, because $\mathcal{E}$ is bounded from below.

(2) Investigate whether $\{ x_n \}_{n \in \mathbb{N}}$ or at least a subsequence converges. Just imagine if
$\mathcal{E}$ has two minima $x_0$ and $x_0'$. Then the alternating sequence would certainly be a
minimizing sequence which does not converge. Another situation may occur, namely
that the minimizer is located at “$\infty$”: take $\mathcal{E}(x) = e^{-x^2}$ on $\mathbb{R}^d$, then no minimizer
exists.

So how do we show that $\{ x_n \}_{n \in \mathbb{N}}$ has a convergent subsequence? Let us first assume in
addition that $\mathcal{E}$ is coercive, i. e. $\mathcal{E}(x) \to \infty$ if and only if $\|x\| \to \infty$. In that case, we may
assume that all the elements in the minimizing sequence satisfy $\mathcal{E}(x_n) \leq E_0 + 1$ (just
discard all the others). Then since $\mathcal{E}$ is coercive, also $\|x_n\| \leq C$ holds for some $C >
0$, and the existence of a convergent subsequence follows from Bolzano-Weierstrass
(every bounded sequence in $\mathbb{R}^d$ has a convergent subsequence). For simplicity, we
denote this convergent subsequence by $\{ x_n \}_{n \in \mathbb{N}}$.

(3) Then the limit point of this convergent subsequence, $x_0 = \lim_{n \to \infty} x_n$, is a candidate
for a minimizer. If $\mathcal{E}$ is continuous, then by

\[ E_0 = \lim_{n \to \infty} \mathcal{E}(x_n) = \mathcal{E}\big( \lim_{n \to \infty} x_n \big) = \mathcal{E}(x_0) \]

$x_0$ is a minimizer of $\mathcal{E}$.
Not surprisingly, things are more complicated if $\mathcal{X}$ is infinite-dimensional.

(i) First of all, showing that a functional is bounded from below is not as immediate
as in the case of functions. For instance, consider the energy functional (10.0.1)
associated to $H = -\Delta_x + V$ where $V$ is not bounded from below (think of something
like the Coulomb potential). Then it is a priori not clear whether $H$, and thus $\mathcal{E}$, is
bounded from below.

(ii) In point (2) the Bolzano-Weierstrass theorem was crucial to extract a convergent
subsequence. Things are not as easy when going to infinite dimensions, because
the unit ball $\{ \|x\| \leq 1 \mid x \in \mathcal{X} \}$ is no longer compact. However, if the bidual $\mathcal{X}''$
is isometrically isomorphic to $\mathcal{X}$ (i. e. $\mathcal{X}$ is reflexive), then by the Banach-Alaoglu
theorem [RS72, Theorem IV.21] every bounded sequence has a weakly convergent
subsequence (cf. Definition 4.3.6). For instance, if $\mathcal{X}$ is a Hilbert space, it is reflexive.

(iii) The essential ingredient in (3) was the continuity of the functional, but this is usually
either very hard to prove or even wrong. However, for the purpose of finding
minimizers, weak lower semi-continuity (w-lsc) suffices, i. e.

\[ x_n \rightharpoonup x_0 \;\; \Longrightarrow \;\; \liminf_{n \to \infty} \mathcal{E}(x_n) \geq \mathcal{E}(x_0) . \]

Because if $x_n$ is initially a minimizing sequence, then by definition

\[ \lim_{n \to \infty} \mathcal{E}(x_n) = E_0 \geq \mathcal{E}(x_0) \geq E_0 , \tag{10.2.1} \]

i. e. $x_0$ minimizes the value of the functional.
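A standard example shows why only lower semi-continuity, and not continuity, can be expected under weak convergence: in the sequence space $\ell^2$ the unit vectors $e_n$ converge weakly to $0$, yet $\|e_n\| = 1$ for all $n$. The sketch below illustrates this with truncated sequences; the truncation dimension `N`, the single test vector `y` and the helper names are assumptions, and testing one vector is of course no proof of weak convergence.

```python
import math

N = 2000  # truncation dimension standing in for the infinite sequence space

# A fixed square-summable test vector y = (1, 1/2, 1/3, ...)
y = [1.0 / (k + 1) for k in range(N)]

def basis_vector(n):
    """The n-th unit vector e_n (zero-based index)."""
    e = [0.0] * N
    e[n] = 1.0
    return e

def inner(a, b):
    return sum(x * z for x, z in zip(a, b))

def norm(a):
    return math.sqrt(inner(a, a))

indices = (0, 9, 99, 999)
pairings = [inner(basis_vector(n), y) for n in indices]   # <e_n, y> = 1 / (n + 1)
norms = [norm(basis_vector(n)) for n in indices]          # always equal to 1

norm_of_weak_limit = 0.0  # the weak limit is the zero vector
print(pairings, norms)
```

The pairings with the fixed vector tend to zero while the norms stay equal to one, so the norm of the weak limit is strictly smaller than the limit of the norms: the norm functional is weakly lower semi-continuous without being weakly continuous.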

Theorem 10.2.1 Assume that

(i) $\Omega \subseteq \mathcal{X}$ is closed under weak limits (weakly convergent sequences converge in $\Omega$),

(ii) $\mathcal{E}$ is weakly lower semi-continuous, and

(iii) $\mathcal{E}$ is coercive.

Then $\mathcal{E}$ is bounded from below and attains its minimum in $\Omega$, i. e. there exists a possibly
non-unique minimizer in $\Omega$.

Proof Set $E_0 := \inf_{x \in \Omega} \mathcal{E}(x)$ and pick a minimizing sequence $\{ x_n \}_{n \in \mathbb{N}}$, i. e. a sequence with
$E_0 = \lim_{n \to \infty} \mathcal{E}(x_n)$. At this point, we do not know whether $E_0$ is finite or $-\infty$. In case $E_0$
is finite, we may assume without loss of generality that $\mathcal{E}(x_n) \leq E_0 + 1$ (simply discard all
elements for which this does not hold). Since $\mathcal{E}$ is coercive, the sequence is then bounded
in norm. By the Banach-Alaoglu theorem, $\{ x_n \}_{n \in \mathbb{N}}$ contains
a weakly convergent subsequence $\{ x_{n_k} \}_{k \in \mathbb{N}}$ for which $x_{n_k} \rightharpoonup x_0 \in \Omega$. The point $x_0$ is the
candidate for the minimizer. Given that $\mathcal{E}$ is w-lsc, we conclude $\liminf_{k \to \infty} \mathcal{E}(x_{n_k}) \geq \mathcal{E}(x_0)$.
However, in view of (10.2.1) we already know $\mathcal{E}(x_0) = E_0$ which not only shows the
existence of a minimizer, but also that $E_0 > -\infty$. □
One application of this theorem is to show the existence of solutions to a PDE, and the
strategy is as follows: Find a functional so that its Euler-Lagrange equations are the PDE
in question. Then the existence of a minimizer implies the existence of a solution to the
PDE, because the PDE characterizes the critical points of the functional.

To get uniqueness of solutions, we have to impose additional assumptions. One of the
standard ones is convexity which is defined analogously to the case of functions.

Definition 10.2.2 (Convex functional) Let $\Omega \subseteq \mathcal{X}$ be a convex subset and $\mathcal{E} : \Omega \longrightarrow \mathbb{R}$ a
functional.

(i) $\mathcal{E}$ is convex iff $\mathcal{E}(s x + (1 - s) y) \leq s \, \mathcal{E}(x) + (1 - s) \, \mathcal{E}(y)$ for all $x , y \in \Omega$ and $s \in [0,1]$.

(ii) $\mathcal{E}$ is strictly convex iff $\mathcal{E}(s x + (1 - s) y) < s \, \mathcal{E}(x) + (1 - s) \, \mathcal{E}(y)$ for all $x \neq y$ and
$s \in (0,1)$.

Convexity has strong implications on minimizers just like in the case of functions. 

Proposition 10.2.3 (i) Every local minimum of a convex functional is a global minimum.

(ii) A strictly convex functional has at most one minimizer.

Proof (i) Assume $x_0$ is just a local, but not a global minimum. Then there exists a
neighborhood $V$ of $x_0$ such that $\mathcal{E}(x_0) \leq \mathcal{E}(x)$ for all $x \in V$ (local minimum) as well
as a point $y \notin V$ with $\mathcal{E}(y) < \mathcal{E}(x_0)$ (local but not global minimum). Connecting $x_0$
and $y$ by a line segment, convexity implies in conjunction with $\mathcal{E}(y) < \mathcal{E}(x_0)$

\[ \mathcal{E}(s y + (1 - s) x_0) \leq s \, \mathcal{E}(y) + (1 - s) \, \mathcal{E}(x_0) < \mathcal{E}(x_0) . \]

Hence, for $s$ small enough, $x' = s y + (1 - s) x_0 \in V$ and we have found points in
$V$ for which $\mathcal{E}(x') < \mathcal{E}(x_0)$, in contradiction to the assumption that $x_0$ is a local
minimum. Hence, every local minimum is also a global minimum.

(ii) Assume there exist two distinct minimizers $x_0$ and $x_0'$ with $\mathcal{E}(x_0) = E_0 = \mathcal{E}(x_0')$.
Then strict convexity

\[ E_0 \leq \mathcal{E}(s x_0 + (1 - s) x_0') < s \, \mathcal{E}(x_0) + (1 - s) \, \mathcal{E}(x_0') = E_0 \]

leads to a contradiction, and the minimum, if it exists, is unique. □

Note that the definition of convexity does not require $\mathcal{E}$ to be once or twice Gateaux-differentiable.
However, if we assume in addition that the function is once or twice differentiable,
we obtain additional characterizations of convexity.

Proposition 10.2.4 Assume £ : H —> R is Gateaux differentiable. 
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10.2 Key points for a rigorous analysis 


(i) £ is convex ijf £[x) > £[y) + (d£(y))(x — y'jfor all x,y 

(ii) £ is convex iff (d£(x) — d£(y))(x — y) > 0 /or all x,y e 

CiU) £ is convex and twice Gateaux differentiable iff (y, d^£(x)y) > Ofor all x,y e 

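Proof (i) The convexity of $\mathcal{E}$ can be expressed by regrouping the terms in the definition,

\[
\mathcal{E}\bigl(y + s(x - y)\bigr) \leq \mathcal{E}(y) + s \bigl(\mathcal{E}(x) - \mathcal{E}(y)\bigr),
\]

which is equivalent to

\[
\mathcal{E}(x) - \mathcal{E}(y) \geq \frac{\mathcal{E}\bigl(y + s(x - y)\bigr) - \mathcal{E}(y)}{s}.
\]

Taking the limit $s \searrow 0$ yields

\[
\mathcal{E}(x) - \mathcal{E}(y) \geq \bigl(d\mathcal{E}(y)\bigr)(x - y),
\]

which implies $\mathcal{E}(x) \geq \mathcal{E}(y) + \bigl(d\mathcal{E}(y)\bigr)(x - y)$.

Conversely, substituting $x \mapsto x$ and $y \mapsto y + s(x - y)$ as well as $x \mapsto y$ and $y \mapsto y + s(x - y)$ into this inequality yields

\[
\mathcal{E}(x) \geq \mathcal{E}\bigl(y + s(x - y)\bigr) + (1 - s) \bigl(d\mathcal{E}(y + s(x - y))\bigr)(x - y),
\]

\[
\mathcal{E}(y) \geq \mathcal{E}\bigl(y + s(x - y)\bigr) - s \bigl(d\mathcal{E}(y + s(x - y))\bigr)(x - y),
\]

and if we multiply the first inequality by $s$, the second by $1 - s$ and add the two, we obtain $\mathcal{E}(s x + (1 - s) y) \leq s \, \mathcal{E}(x) + (1 - s) \, \mathcal{E}(y)$.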

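(ii) From (i) we deduce $\mathcal{E}(x) \geq \mathcal{E}(y) + \bigl(d\mathcal{E}(y)\bigr)(x - y)$ and $\mathcal{E}(y) \geq \mathcal{E}(x) + \bigl(d\mathcal{E}(x)\bigr)(y - x)$; adding the two yields the right-hand side.

Conversely, set $f(s) := \mathcal{E}\bigl(y + s(x - y)\bigr)$. Then $f'(s) = \bigl(d\mathcal{E}(y + s(x - y))\bigr)(x - y)$, and by assumption

\[
f'(s) - f'(0) = \bigl(d\mathcal{E}(y + s(x - y)) - d\mathcal{E}(y)\bigr)(x - y) \geq 0
\]

holds for $s > 0$. Integrating this inequality with respect to $s$ over $[0,1]$, we obtain

\[
f(1) - f(0) - f'(0) = \mathcal{E}(x) - \mathcal{E}(y) - \bigl(d\mathcal{E}(y)\bigr)(x - y) \geq 0,
\]

which by virtue of (i) is equivalent to convexity.

(iii) “⇒”: By (ii), the convexity of $\mathcal{E}$ means

\[
\bigl(d\mathcal{E}(x + s y) - d\mathcal{E}(x)\bigr)(s y) \geq 0,
\]

and consequently,

\[
\bigl\langle y, \bigl(d^2\mathcal{E}(x)\bigr)(y) \bigr\rangle
= \lim_{s \searrow 0} \frac{1}{s^2} \bigl(d\mathcal{E}(x + s y) - d\mathcal{E}(x)\bigr)(s y)
= \lim_{s \searrow 0} \frac{1}{s} \bigl(d\mathcal{E}(x + s y) - d\mathcal{E}(x)\bigr)(y)
\geq 0.
\]

“⇐”: The convexity follows from a Taylor expansion of $\mathcal{E}$. □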

There is an analogous version of the proposition characterizing strict convexity; since the 
proofs are virtually identical, we leave it to the reader to modify them appropriately. 

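Corollary 10.2.5 Assume $\mathcal{E} : \Omega \to \mathbb{R}$ is Gâteaux differentiable.

(i) $\mathcal{E}$ is strictly convex iff $\mathcal{E}(x) > \mathcal{E}(y) + \bigl(d\mathcal{E}(y)\bigr)(x - y)$ for all $x \neq y$.

(ii) $\mathcal{E}$ is strictly convex iff $\bigl(d\mathcal{E}(x) - d\mathcal{E}(y)\bigr)(x - y) > 0$ for all $x \neq y$.

(iii) $\mathcal{E}$ is strictly convex and twice Gâteaux differentiable iff $\langle y, d^2\mathcal{E}(x) y \rangle > 0$ for all $x \neq y$.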
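The first-order characterizations can be spot-checked numerically. The following is a minimal sketch, assuming a hand-picked strictly convex test functional on $\mathbb{R}^2$ (the functional `energy`, its gradient and the sampling box are illustrative choices, not taken from the text); it samples random pairs $x \neq y$ and tests the strict inequalities (i) and (ii).

```python
import random


def energy(x):
    """Strictly convex test functional on R^2 (hypothetical example)."""
    x1, x2 = x
    return x1**4 + x1**2 + x1 * x2 + x2**2


def gradient(x):
    """Gradient of `energy`; it represents the differential dE(x)."""
    x1, x2 = x
    return (4 * x1**3 + 2 * x1 + x2, x1 + 2 * x2)


def pairing(g, h):
    """Apply the differential represented by g to the direction h."""
    return g[0] * h[0] + g[1] * h[1]


def check(samples=1000, seed=0):
    """Test inequalities (i) and (ii) on random pairs of points."""
    rng = random.Random(seed)
    first_order_ok = True
    monotone_ok = True
    for _ in range(samples):
        x = (rng.uniform(-2, 2), rng.uniform(-2, 2))
        y = (rng.uniform(-2, 2), rng.uniform(-2, 2))
        diff = (x[0] - y[0], x[1] - y[1])
        gx, gy = gradient(x), gradient(y)
        # (i): E(x) > E(y) + dE(y)(x - y)
        first_order_ok &= energy(x) > energy(y) + pairing(gy, diff)
        # (ii): (dE(x) - dE(y))(x - y) > 0
        monotone_ok &= pairing((gx[0] - gy[0], gx[1] - gy[1]), diff) > 0
    return first_order_ok, monotone_ok


print(check())
```

A passing check is of course no proof; it only illustrates what the inequalities say for one concrete functional.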

Now we go back to the problem at hand, the existence of minimizers. The first step is the following

Lemma 10.2.6 Suppose $\mathcal{E}$ is a convex functional defined on a closed subset $\Omega \subseteq X$ of a reflexive Banach space $X$; moreover, we assume its Gâteaux derivative exists for all $x \in \Omega$. Then $\mathcal{E}$ is w-lsc.


Proof Pick an arbitrary $x \in \Omega$ which we leave fixed. Moreover, let $(x_n)_{n \in \mathbb{N}}$ be a sequence which converges weakly to $x$. Then the characterization of convexity in Proposition 10.2.4 (i) yields

\[
\mathcal{E}(x_n) \geq \mathcal{E}(x) + \bigl(d\mathcal{E}(x)\bigr)(x_n - x).
\]

Upon taking the limit $n \to \infty$, the last term on the right-hand side vanishes since $x_n \rightharpoonup x$, and we deduce that $\mathcal{E}$ is weakly lower semicontinuous,

\[
\liminf_{n \to \infty} \mathcal{E}(x_n) \geq \mathcal{E}(x). \qquad □
\]
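That the inequality in weak lower semicontinuity can be strict is visible in a standard example. The sketch below assumes the space $L^2([0, 2\pi])$, the sequence $u_n(x) = \sin(nx)$ (which converges weakly to $0$ by the Riemann-Lebesgue lemma) and the convex functional $\mathcal{E}(u) = \|u\|^2$; the test function $\varphi(x) = x$ and the quadrature rule are illustrative choices. Analytically, $\langle u_n, \varphi \rangle = -2\pi/n \to 0$ while $\mathcal{E}(u_n) = \pi > 0 = \mathcal{E}(0)$ for every $n$.

```python
import math


def integrate(func, a, b, cells=20000):
    """Composite midpoint rule for the integral of func over [a, b]."""
    h = (b - a) / cells
    return h * sum(func(a + (k + 0.5) * h) for k in range(cells))


def pairing_with_test_function(n):
    """<u_n, phi> in L^2([0, 2*pi]) for u_n(x) = sin(n x), phi(x) = x."""
    return integrate(lambda x: x * math.sin(n * x), 0.0, 2 * math.pi)


def energy(n):
    """E(u_n) = ||u_n||^2, a convex functional on L^2([0, 2*pi])."""
    return integrate(lambda x: math.sin(n * x) ** 2, 0.0, 2 * math.pi)


for n in (1, 5, 25):
    # The pairing decays like 1/n, the energy stays at pi.
    print(n, pairing_with_test_function(n), energy(n))
```

Testing against a single $\varphi$ does not establish weak convergence; it merely makes the decay of the pairing visible next to the constant energy.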


Theorem 10.2.7 Assume

(i) $\Omega \subseteq X$ is closed under weak limits (weakly convergent sequences converge in $\Omega$),

(ii) $\mathcal{E}$ is strictly convex and

(iii) $\mathcal{E}$ is coercive.

Then there exists a unique minimizer.

Proof The convexity of $\mathcal{E}$ implies weak lower semicontinuity (Lemma 10.2.6). Hence, Theorem 10.2.1 applies and we know there exists a minimizer, and in view of Proposition 10.2.3 (ii) this minimizer is necessarily unique. □

As mentioned earlier, variational calculus can be used to show existence of solutions if the PDE in question coincides with the Euler-Lagrange equations of a functional. If the functional is in addition strictly convex, then our arguments here show that the solution is unique.
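A finite-dimensional caricature makes this concrete. The sketch below assumes the discretized Dirichlet energy $\mathcal{E}(u) = \sum_i (u_{i+1} - u_i)^2 / (2h) - h \sum_i f_i u_i$ on a uniform grid of $[0,1]$ with zero boundary values; its Euler-Lagrange equation is the standard three-point discretization of $-u'' = f$. The grid size, step size and starting vectors are illustrative assumptions. Since the energy is strictly convex and coercive, gradient descent from two different starting points should arrive at the same minimizer.

```python
def energy_gradient(u, h, f):
    """Gradient of the discrete Dirichlet energy (zero boundary values)."""
    n = len(u)
    grad = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        grad.append((2 * u[i] - left - right) / h - h * f[i])
    return grad


def minimize(u0, h, f, steps=3000):
    """Plain gradient descent; the step h/4 is below 2 / (largest Hessian eigenvalue)."""
    u = list(u0)
    tau = h / 4
    for _ in range(steps):
        grad = energy_gradient(u, h, f)
        u = [ui - tau * gi for ui, gi in zip(u, grad)]
    return u


h = 0.1
nodes = [h * (i + 1) for i in range(9)]           # interior grid points
f = [1.0] * len(nodes)                            # constant source term
u_a = minimize([0.0] * len(nodes), h, f)          # start at zero
u_b = minimize([(-1.0) ** i for i in range(len(nodes))], h, f)  # oscillating start
exact = [x * (1 - x) / 2 for x in nodes]          # solves -u'' = 1, u(0) = u(1) = 0

print(max(abs(a - b) for a, b in zip(u_a, u_b)))
print(max(abs(a - e) for a, e in zip(u_a, exact)))
```

For this particular right-hand side the discrete minimizer agrees with $x(1-x)/2$ at the grid points, because the three-point stencil is exact for quadratic polynomials.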


10.3 Bifurcations 

The idea is to understand certain physical phenomena as incarnations of a “bifurcation”, where certain properties of a system change abruptly. Before we show how this meshes with the preceding content of the chapter, let us quickly explore one incarnation from physics in order to introduce some of the necessary terminology.

Phase transitions, for instance, are points where some order parameter changes abruptly when an external parameter is changed. In a ferromagnet, the order parameter is the macroscopic magnetization while the external parameter is temperature. Below a critical temperature, the Curie temperature, the microscopic magnets can align to produce a non-zero macroscopic magnetization; another magnet can be used to align these microscopic magnets, for example. This magnetization persists even when the magnet is heated, up to the critical temperature, at which the magnetization suddenly drops to $0$. In summary, below the critical temperature, there are two states, the unmagnetized state where the macroscopic magnetization $M(T) = 0$ vanishes and the magnetized state where $M(T) \neq 0$. Above the critical temperature, only the $M(T) = 0$ state persists. Other physical effects such as superconductivity can be explained along the exact same lines.

To tie this section to the theme of the chapter, let us consider a parameter-dependent functional $\mathcal{E} : \mathbb{R} \times \mathcal{D} \to \mathbb{R}$ where $\mathcal{D}$ is dense in some Hilbert space $\mathcal{H}$; we will denote the external parameter with $\mu$ and the Hilbert space variable with $x$, i. e. $\mathcal{E}(\mu, x)$. The bifurcation analysis starts with

\[
F(\mu, x) := \partial_x \mathcal{E}(\mu, x)
\]

which enters the Euler-Lagrange equations and determines the stationary points of the functional. Here, the partial derivative is defined in terms of the scalar product as

\[
\bigl(d_x \mathcal{E}(\mu, x)\bigr)(y) = \bigl\langle y, \partial_x \mathcal{E}(\mu, x) \bigr\rangle.
\]

In what follows, we assume that the “normal solution” $x = 0$ solves $F(\mu, 0) = 0$ for all $\mu \in \mathbb{R}$ and we want to know whether there is a bifurcation solution $x(\mu) \neq 0$. In the present context, we define the notion of

Definition 10.3.1 (Bifurcation point) $(\mu_0, 0)$ is a bifurcation point if there exists $x(\mu)$ on a neighborhood $[\mu_0, \mu_0 + \delta)$ so that $x(\mu) \neq 0$ on $(\mu_0, \mu_0 + \delta)$ and $F\bigl(\mu, x(\mu)\bigr) = 0$.

A consequence of the Implicit Function Theorem [Tes10, Theorem 1.6] is the following

Proposition 10.3.2 If $(\mu_0, 0)$ is a bifurcation point, then $\partial_x F(\mu_0, 0)$ does not have a bounded inverse.

Proof (Sketch) If $\partial_x F(\mu_0, 0)$ were invertible, then by the Implicit Function Theorem [Tes10, Theorem 1.6] we could uniquely extend the trivial solution $x(\mu) = 0$ in a vicinity of $\mu_0$. But given the multivaluedness at a bifurcation point, this is clearly false. □

The last example has shown us that $\partial_x F(\mu_0, 0)$ not being invertible is just necessary but not sufficient. Hence, we need to impose additional conditions, and one of the standard results in this direction is a theorem due to Krasnoselski [KZ84].

Theorem 10.3.3 (Krasnoselski) Assume $\mathcal{D} \subseteq \mathcal{H}$ is a dense subset of a Hilbert space and the map $F : \mathbb{R} \times \mathcal{D} \to \mathcal{H}$ is such that

(i) $\partial_x F(\mu, x)$ is a $C^1$ map at $(\mu_0, 0)$,

(ii) $\partial_x F(\mu_0, 0)$ is selfadjoint,

(iii) $0$ is an isolated eigenvalue of odd and finite multiplicity, and

(iv) there exists $v \in \ker \partial_x F(\mu_0, 0)$ such that $\langle v, \partial_\mu \partial_x F(\mu_0, 0) v \rangle \neq 0$.

Then $(\mu_0, 0)$ is a bifurcation point.
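In the scalar case the hypotheses can be inspected by hand. The sketch below compares two hypothetical maps on $\mathbb{R} \times \mathbb{R}$, neither of which is taken from the text: for $F(\mu, x) = \mu x - x^3$ the derivative $\partial_x F(0, 0)$ vanishes and $\partial_\mu \partial_x F(0, 0) = 1 \neq 0$, so condition (iv) holds and a branch $x(\mu) = \sqrt{\mu}$ emerges; for $F(\mu, x) = \mu^2 x + x^3$ the derivative $\partial_x F(0, 0)$ also vanishes, but the mixed derivative is $0$ and no non-trivial solution exists. Derivatives are approximated by central differences.

```python
import math


def F_pitchfork(mu, x):
    """Hypothetical example satisfying condition (iv) at (0, 0)."""
    return mu * x - x**3


def F_degenerate(mu, x):
    """Hypothetical example violating condition (iv) at (0, 0)."""
    return mu**2 * x + x**3


def dx(F, mu, x, h=1e-5):
    """Central difference approximation of the x-derivative of F."""
    return (F(mu, x + h) - F(mu, x - h)) / (2 * h)


def dmu_dx(F, mu, x, h=1e-4):
    """Central difference approximation of the mixed derivative."""
    return (dx(F, mu + h, x) - dx(F, mu - h, x)) / (2 * h)


def positive_roots(F, mu, radius=1.0, cells=1000):
    """Bracket roots of x -> F(mu, x) in (0, radius] via sign changes."""
    roots = []
    for k in range(1, cells):
        a, b = k * radius / cells, (k + 1) * radius / cells
        if F(mu, a) * F(mu, b) < 0:
            roots.append(0.5 * (a + b))
    return roots


for F in (F_pitchfork, F_degenerate):
    print(F.__name__, dx(F, 0.0, 0.0), dmu_dx(F, 0.0, 0.0),
          positive_roots(F, 0.05))
```

The second map shows that non-invertibility of $\partial_x F(\mu_0, 0)$ alone does not force a bifurcation.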

The main ingredient is a procedure called Lyapunov-Schmidt reduction: define the linear operator $L(\mu) := \partial_x F(\mu, 0)$ and the orthogonal projection $P$ onto $\ker L(\mu_0)$. The projection and its orthogonal complement $P^\perp = 1 - P$ induce a splitting of the Hilbert space

\[
\mathcal{H} = \operatorname{ran} P \oplus \operatorname{ran} P^\perp.
\]

Hence, any $x = x_\parallel + x_\perp$ can be uniquely decomposed into $x_\parallel$ from the finite-dimensional space $\operatorname{ran} P$ and $x_\perp \in \operatorname{ran} P^\perp$. Consequently, also the equation $F(\mu, x) = 0 \in \mathcal{H}$ can equivalently be seen as

\[
F_\parallel(\mu, x_\parallel, x_\perp) := P F(\mu, x_\parallel + x_\perp) = 0 \in \operatorname{ran} P, \tag{10.3.1a}
\]

\[
F_\perp(\mu, x_\parallel, x_\perp) := P^\perp F(\mu, x_\parallel + x_\perp) = 0 \in \operatorname{ran} P^\perp. \tag{10.3.1b}
\]

The idea of the Lyapunov-Schmidt reduction is to solve these equations iteratively: Given $(\mu, x_\parallel)$ we first solve the branching equation

\[
F_\perp(\mu, x_\parallel, x_\perp) = 0
\]

in the vicinity of the branching point $(\mu_0, 0)$. By assumption $\partial_{x_\perp} F_\perp(\mu_0, 0, 0)$ has a bounded inverse, and thus, the Implicit Function Theorem [Tes10, Theorem 1.6] yields a solution $x_\perp(\mu, x_\parallel)$. We then proceed by inserting $x_\perp(\mu, x_\parallel)$ into $F_\parallel$, thereby obtaining a function

\[
f(\mu, x_\parallel) := F_\parallel\bigl(\mu, x_\parallel, x_\perp(\mu, x_\parallel)\bigr)
\]

that depends only on finitely many variables ($n = \dim \ker L(\mu_0)$ is finite by assumption), i. e. $f : \mathbb{R} \times \mathbb{R}^n \to \mathbb{R}^n$. The remaining equation $f(\mu, x_\parallel) = 0$ then needs to be solved by other means.
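The two-step procedure can be traced on a toy problem. The following is a minimal sketch, assuming the hypothetical map $F(\mu, x) = \bigl(\mu x_1 - x_1^3 - x_1 x_2, \; x_2 - x_1^2/2\bigr)$ on $\mathbb{R}^2$, for which $L(\mu) = \operatorname{diag}(\mu, 1)$, $\mu_0 = 0$ and $\ker L(\mu_0)$ is spanned by the first basis vector; none of this is taken from the text. The first step solves the second component for $x_\perp = x_2$ by Newton's method, the second step solves the reduced scalar equation (divided by $s = x_1$) by bisection.

```python
def F(mu, x1, x2):
    """Toy map on R^2 (hypothetical); returns (F_parallel, F_perp)."""
    return (mu * x1 - x1**3 - x1 * x2, x2 - 0.5 * x1**2)


def solve_perp(mu, x1, tol=1e-14, max_iter=50):
    """Step 1: solve F_perp(mu, x1, x2) = 0 for x2 by Newton's method."""
    x2 = 0.0
    h = 1e-6
    for _ in range(max_iter):
        residual = F(mu, x1, x2)[1]
        if abs(residual) < tol:
            break
        # Finite-difference slope of F_perp with respect to x2.
        slope = (F(mu, x1, x2 + h)[1] - F(mu, x1, x2 - h)[1]) / (2 * h)
        x2 -= residual / slope
    return x2


def reduced(mu, s):
    """Step 2: reduced function f(mu, s) = F_parallel(...) / s."""
    return F(mu, s, solve_perp(mu, s))[0] / s


def branch(mu, lo=1e-6, hi=1.0, iters=200):
    """Bisection for a root s > 0 of s -> reduced(mu, s)."""
    f_lo = reduced(mu, lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = reduced(mu, mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


mu = 0.06
s = branch(mu)
print(s, solve_perp(mu, s), F(mu, s, solve_perp(mu, s)))
```

For this map the reduction can also be done by hand: $x_\perp = s^2/2$, so the reduced equation reads $\mu - \tfrac{3}{2} s^2 = 0$ and the branch is $s = \sqrt{2\mu/3}$, which gives $s = 0.2$ for $\mu = 0.06$.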

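Proof We will use the notation introduced in the preceding paragraphs. First of all the map

\[
F_\perp : (\mathbb{R} \times P\mathcal{D}) \times P^\perp \mathcal{D} \to \operatorname{ran} P^\perp
\]

satisfies the assumptions of the Implicit Function Theorem [Tes10, Theorem 1.6]: evidently, it inherits the $C^1$ property from $F$, and $F_\perp(\mu, 0, 0) = 0$ holds for all $\mu \in \mathbb{R}$. Moreover, $\partial_{x_\perp} F_\perp(\mu_0, 0, 0)$ has a bounded inverse because $0 \in \sigma\bigl(\partial_x F(\mu_0, 0)\bigr)$ is an isolated eigenvalue. Consequently, there exists a neighborhood of $(\mu_0, 0) \in \mathbb{R} \times P\mathcal{D}$ and a function $x_\perp(\mu, x_\parallel)$ on that neighborhood that is uniquely determined by

\[
F_\perp\bigl(\mu, x_\parallel, x_\perp(\mu, x_\parallel)\bigr) = 0.
\]

Later on, we will crucially need the technical estimate

\[
x_\perp(\mu, x_\parallel), \; \partial_\mu x_\perp(\mu, x_\parallel) = \mathcal{O}\bigl(\|x_\parallel\| \, |\mu - \mu_0|\bigr) + o\bigl(\|x_\parallel\|\bigr) \tag{10.3.2}
\]

as $\mu \to \mu_0$ and $x_\parallel \to 0$, but we postpone the proof of (10.3.2) until the end.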

For simplicity, let us only prove Krasnoselski’s Theorem for $\dim \ker L(\mu_0) = 1$. Then the kernel is spanned by a single normalized vector $v \in \ker L(\mu_0)$, and we can write $x_\parallel = s v$ for some $s \in \mathbb{R}$. Combining (10.3.1a) with $x_\perp(\mu, s v)$ yields a scalar function $f : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$, and we are looking for solutions to

\[
f(\mu, s) := \frac{1}{s} \bigl\langle v, F\bigl(\mu, s v + x_\perp(\mu, s v)\bigr) \bigr\rangle = 0. \tag{10.3.3}
\]

We have added the prefactor $1/s$ to make $\lim_{s \to 0} \partial_\mu f(\mu_0, s) \neq 0$, but more on that later. As in the case of functions, we can Taylor expand

\[
F(\mu, x) = F(\mu, 0) + \bigl(\partial_x F(\mu, 0)\bigr) x + R(\mu, x) = L(\mu) x + R(\mu, x) \tag{10.3.4}
\]

to first order; here, the remainder $R(\mu, x) = o(\|x\|)$ vanishes as $x \to 0$. Plugging (10.3.4) into (10.3.3) yields

\[
f(\mu, s) = \langle v, L(\mu) v \rangle + \bigl\langle v, L(\mu) \, s^{-1} x_\perp(\mu, s v) \bigr\rangle + \bigl\langle v, s^{-1} R\bigl(\mu, s v + x_\perp(\mu, s v)\bigr) \bigr\rangle.
\]

We obtain the branching solution through the Implicit Function Theorem: once we have proven $\partial_\mu f(\mu_0, 0) \neq 0$, we obtain a function $s(\mu)$ in the vicinity of $\mu_0$ so that $f\bigl(\mu, s(\mu)\bigr) = 0$. And this function $s(\mu)$ also defines the branching solution

\[
x(\mu) := s(\mu) v + x_\perp\bigl(\mu, s(\mu) v\bigr) \neq 0 \tag{10.3.5}
\]

satisfying $F\bigl(\mu, x(\mu)\bigr) = 0$ by construction.

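Hence, we compute the derivative

\[
\begin{aligned}
\partial_\mu f(\mu, s) ={} & \langle v, \partial_\mu L(\mu) v \rangle + \bigl\langle v, \partial_\mu L(\mu) \, s^{-1} x_\perp(\mu, s v) \bigr\rangle + \bigl\langle v, L(\mu) \, s^{-1} \partial_\mu x_\perp(\mu, s v) \bigr\rangle \\
& + \bigl\langle v, s^{-1} \partial_\mu R\bigl(\mu, s v + x_\perp(\mu, s v)\bigr) \bigr\rangle \\
& + \bigl\langle v, \bigl(\partial_x R(\mu, s v + x_\perp(\mu, s v))\bigr) \, s^{-1} \partial_\mu x_\perp(\mu, s v) \bigr\rangle,
\end{aligned}
\tag{10.3.6}
\]

and use (10.3.2) in conjunction with assumption (iv),

\[
\langle v, \partial_\mu L(\mu_0) v \rangle = \langle v, \partial_\mu \partial_x F(\mu_0, 0) v \rangle \neq 0,
\]

to deduce that all but the first term vanish in the limit $\mu \to \mu_0$ and $s \searrow 0$. First of all, (10.3.2) tells us that

\[
s^{-1} x_\perp(\mu, s v), \; s^{-1} \partial_\mu x_\perp(\mu, s v) = \mathcal{O}\bigl(|\mu - \mu_0|\bigr) + o(1)
\]

goes to $0$ in the limit $\mu \to \mu_0$ and $s \searrow 0$. Moreover, the assumption that $\partial_\mu F$ exists as a Gâteaux derivative means we can Taylor expand this function to first order, and the terms are just the $\mu$-derivatives of (10.3.4). Consequently, $\partial_\mu L(\mu_0)$ is bounded and $\partial_\mu R(\mu, x) = o(\|x\|)$, and we deduce that the second and third term in (10.3.6) vanish.

Moreover, the terms involving $R$ also vanish: by assumption also $\partial_\mu F$ has a Taylor expansion at $(\mu, 0)$, so that

\[
\bigl\| \partial_\mu R\bigl(\mu, s v + x_\perp(\mu, s v)\bigr) \bigr\| = o\bigl(\|s v + x_\perp(\mu, s v)\|\bigr)
\]

is necessarily small, and if we combine that with (10.3.2), we deduce that this term vanishes as $\mu \to \mu_0$ and $s \searrow 0$.

The last term can be treated along the same lines: $\partial_x F(\mu, x)$ is $C^1$ in $(\mu_0, 0)$ by assumption so that, with $x = s v + x_\perp(\mu, s v)$ and $\partial_x \bigl(L(\mu) x\bigr) = L(\mu)$,

\[
\partial_x R(\mu, x) = \partial_x F(\mu, x) - L(\mu) = \mathcal{O}(\|x\|) \xrightarrow{\mu \to \mu_0, \; s \searrow 0} 0.
\]

In conclusion, we have just shown $\partial_\mu f(\mu_0, 0) \neq 0$, the Implicit Function Theorem applies, and we obtain the branching solution (10.3.5).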

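All that remains are proofs of the estimates (10.3.2). We use the Taylor expansion (10.3.4) to rewrite $F_\perp(\mu, x_\parallel, x_\perp) = 0$ as

\[
P^\perp L(\mu) P^\perp x_\perp(\mu, s v) + P^\perp L(\mu) s v + P^\perp R\bigl(\mu, s v + x_\perp(\mu, s v)\bigr) = 0.
\]

$L^\perp(\mu) := P^\perp L(\mu) P^\perp$ is invertible on $\operatorname{ran} P^\perp$ in a neighborhood of $\mu_0$, because $\mu \mapsto L(\mu)$ is continuous and thus, the spectral gap around $0$ does not suddenly collapse [Kat95, Chapter VII.3]. Now we can solve for $x_\perp$ and insert $L(\mu_0) s v = 0$ free of charge,

\[
\begin{aligned}
x_\perp(\mu, s v) &= - L^\perp(\mu)^{-1} P^\perp \Bigl( L(\mu) s v + R\bigl(\mu, s v + x_\perp(\mu, s v)\bigr) \Bigr) \\
&= - L^\perp(\mu)^{-1} P^\perp \Bigl( \bigl(L(\mu) - L(\mu_0)\bigr) s v + R\bigl(\mu, s v + x_\perp(\mu, s v)\bigr) \Bigr) \\
&= \mathcal{O}\bigl(\|s v\| \, |\mu - \mu_0|\bigr) + o\bigl(\|s v + x_\perp(\mu, s v)\|\bigr).
\end{aligned}
\tag{10.3.7}
\]

The first term is clearly $\mathcal{O}\bigl(s \, |\mu - \mu_0|\bigr)$. Initially, we merely obtain $o\bigl(\|s v + x_\perp(\mu, s v)\|\bigr)$ for the second term, but if $x_\perp(\mu, s v) \to x \neq 0$ as $\mu \to \mu_0$ and $s \searrow 0$, we would obtain

\[
x = o(\|x\|)
\]

which is absurd. Thus, the asymptotic behavior of $x_\perp(\mu, s v)$ is described by (10.3.2). The proof for $\partial_\mu x_\perp(\mu, s v)$ is analogous: one starts from equation (10.3.7) and uses

\[
\partial_\mu \bigl( L^\perp(\mu)^{-1} \bigr) = - L^\perp(\mu)^{-1} \, \partial_\mu L^\perp(\mu) \, L^\perp(\mu)^{-1}
\]

as well as the assumption that $\mu \mapsto \partial_x F(\mu, x)$ is $C^1$ in $(\mu_0, 0)$. This concludes the proof. □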
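The formula for the derivative of an inverse used in the last step is easy to confirm numerically in finite dimensions. The sketch below assumes a hypothetical symmetric $2 \times 2$ family `L(mu)` standing in for $L^\perp(\mu)$ (not taken from the text) and compares a central difference of $\mu \mapsto L(\mu)^{-1}$ with $-L(\mu)^{-1} \, \partial_\mu L(\mu) \, L(\mu)^{-1}$.

```python
def L(mu):
    """Hypothetical symmetric 2x2 operator family, invertible near mu = 0.3."""
    return [[2.0 + mu, mu], [mu, 3.0 - mu**2]]


def dL(mu):
    """Entrywise mu-derivative of L(mu)."""
    return [[1.0, 1.0], [1.0, -2.0 * mu]]


def inverse(m):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul(p, q):
    """Product of two 2x2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def finite_difference_of_inverse(mu, h=1e-5):
    """Central difference approximation of d/dmu [L(mu)^{-1}]."""
    plus, minus = inverse(L(mu + h)), inverse(L(mu - h))
    return [[(plus[i][j] - minus[i][j]) / (2 * h) for j in range(2)]
            for i in range(2)]


def formula(mu):
    """Right-hand side -L^{-1} (dL) L^{-1} of the identity."""
    inv = inverse(L(mu))
    prod = matmul(matmul(inv, dL(mu)), inv)
    return [[-prod[i][j] for j in range(2)] for i in range(2)]


def max_difference(mu):
    """Largest entrywise deviation between both sides of the identity."""
    lhs, rhs = finite_difference_of_inverse(mu), formula(mu)
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))


print(max_difference(0.3))
```

The identity itself follows from differentiating $L(\mu) L(\mu)^{-1} = 1$ with the product rule.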


2014.04.03 